PostgreSQL Weekly News – March 11 2012

The schedule of talks for PGDay NYC 2012 is out.

== PostgreSQL Product News ==

DBD::Pg 2.19.1, the Perl interface to PostgreSQL, released.

Benetl 4.0, a free ETL tool for PostgreSQL, released.

PostgreSQL Code Factory 12.3, a Windows GUI for developing PostgreSQL queries and scripts, released.

Pyrseas 0.5.0, a toolkit for PostgreSQL version control, released on PGXN.

== PostgreSQL Jobs for March ==

== PostgreSQL Local ==

PGDay Austin 2012 will be held March 28.

PGDay DC 2012 will be held on March 30.

PGDay NYC will be held April 2, 2012 at Lighthouse International in
New York City.

PGCon 2012 will be held 17-18 May 2012, in Ottawa at the University of
Ottawa. It will be preceded by two days of tutorials on 15-16 May 2012.

PGDay France will be in Lyon on June 7, 2012.

== PostgreSQL in the News ==

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to, German language
to, Italian language to Spanish language

== Reviews ==

== Applied Patches ==

Peter Eisentraut pushed:

– Add isolation test to check-world and installcheck-world

– libpq: Small code clarification, and avoid casting away const

– psql: Fix invalid memory access. Due to an apparent thinko, when
printing a table in expanded mode (\x), space would be allocated for
1 slot plus 1 byte per line, instead of 1 slot per line plus 1 slot
for the NULL terminator. When the line count is small, reading or
writing the terminator would therefore access memory beyond what was
allocated.
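
The size arithmetic behind this psql fix can be sketched as follows.
This is an illustration only, not psql's code. It reads the
description as "(one slot plus one byte) per line", which is what
makes small line counts the problem case. The 16-byte slot size and
all function names are assumptions made for the example.

```python
SLOT_SIZE = 16  # assumed size in bytes of one array slot (hypothetical)


def buggy_alloc_bytes(line_count: int) -> int:
    """(One slot plus one byte) for each line: the mistaken calculation."""
    return (SLOT_SIZE + 1) * line_count


def correct_alloc_bytes(line_count: int) -> int:
    """One slot per line plus one extra slot for the NULL terminator."""
    return SLOT_SIZE * (line_count + 1)


def terminator_fits(alloc_bytes: int, line_count: int) -> bool:
    """True if the terminator slot (index line_count) lies inside the allocation."""
    return SLOT_SIZE * (line_count + 1) <= alloc_bytes


if __name__ == "__main__":
    for n in (1, 3, 15, 16, 40):
        print(n, buggy_alloc_bytes(n), correct_alloc_bytes(n),
              terminator_fits(buggy_alloc_bytes(n), n))
```

Under these assumptions the mistaken size only becomes large enough
once the line count reaches the slot size in bytes, which matches the
remark that the invalid access shows up when the line count is small.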

– libpq: Fix memory leak. If a client encoding is specified as a
connection parameter (or environment variable), internal storage
allocated for it would never be freed.

– psql: Fix memory leak. In expanded auto mode, a lot of allocated
memory was not cleaned up. found by Coverity

– ecpg: Fix rare memory leaks. found by Coverity

– ecpg: Fix off-by-one error in memory copying. In a rare case, one
byte past the end of memory belonging to the sqlca_t structure would
be written to. found by Coverity

– psql: Remove useless code. Apparently a copy-and-paste mistake
introduced in 8ddd22f2456af0155f9c183894f481203e86b76e. found by

– Add support for renaming constraints. reviewed by Josh Berkus and
Dimitri Fontaine

– Add more detail to error message for invalid arguments for server
process. It now prints the argument that was at fault. Also fix a
small misbehavior where the error message issued by getopt() would
complain about a program named “--single”, because that’s what
argv[0] is in the server process.

Tom Lane pushed:

– Improve documentation around logging_collector and use of stderr.
In backup.sgml, point out that you need to be using the logging
collector if you want to log messages from a failing archive_command
script. (This is an oversimplification, in that it will work
without the collector as long as you’re not sending postmaster
stderr to /dev/null; but it seems like a good idea to encourage use
of the collector to avoid problems with multiple processes
concurrently scribbling on one file.) In config.sgml, do some
wordsmithing of logging_collector discussion. Per bug #6518 from
Janning Vygen

– Redesign PlanForeignScan API to allow multiple paths for a foreign
table. The original API specification only allowed an FDW to create
a single access path, which doesn’t seem like a terribly good idea
in hindsight. Instead, move the responsibility for building the
Path node and calling add_path() into the FDW’s PlanForeignScan
function. Now, it can do that more than once if appropriate. There
is no longer any need for the transient FdwPlan struct, so get rid
of that. Etsuro Fujita, Shigeru Hanada, Tom Lane

– Add a hook for processing messages due to be sent to the server log.
Use-cases for this include custom log filtering rules and custom log
message transmission mechanisms (for instance, lossy log message
collection, which has been discussed several times recently). As is
our common practice for hooks, there’s no regression test nor
user-facing documentation for this, though the author did exhibit a
sample module using the hook. Martin Pihlak, reviewed by Marti

– Expose an API for calculating catcache hash values. Now that cache
invalidation callbacks get only a hash value, and not a tuple TID
(per commits 632ae6829f7abda34e15082c91d9dfb3fc0f298b and
b5282aa893e565b7844f8237462cb843438cdd5e), the only way they can
restrict what they invalidate is to know what the hash values mean.
setrefs.c was doing this via a hard-wired assumption but that seems
pretty grotty, and it’ll only get worse as more cases come up. So
let’s expose a calculation function that takes the same parameters
as SearchSysCache. Per complaint from Marko Kreen.
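
The idea in the catcache item above, filtering invalidations when a
callback receives only a hash value, can be sketched roughly as below.
Everything here is hypothetical: the hash is a CRC32 stand-in rather
than PostgreSQL's hash function, and the class, function names, and
cache id are invented for the illustration.

```python
import zlib


def cache_hash_value(cache_id: int, *keys) -> int:
    """Stand-in hash over a cache id and its lookup keys.

    The point is only that the same keys used for a lookup
    always produce the same hash value.
    """
    return zlib.crc32(repr((cache_id, keys)).encode("utf-8"))


class Dependencies:
    """Remembers hash values of the cache entries something depends on."""

    def __init__(self) -> None:
        self._hashes = set()

    def depend_on(self, cache_id: int, *keys) -> None:
        self._hashes.add(cache_hash_value(cache_id, *keys))

    def invalidated_by(self, hash_value: int) -> bool:
        """Callback side: only a hash value is available here."""
        return hash_value in self._hashes


if __name__ == "__main__":
    HYPOTHETICAL_CACHE_ID = 1
    deps = Dependencies()
    deps.depend_on(HYPOTHETICAL_CACHE_ID, 16384)
    print(deps.invalidated_by(cache_hash_value(HYPOTHETICAL_CACHE_ID, 16384)))
```

A hash collision in such a scheme can only cause an unnecessary
invalidation, never a missed one, which is the safe direction.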

– Add GetForeignColumnOptions() to foreign.c, and add some
documentation. GetForeignColumnOptions provides some abstraction
for accessing column-specific FDW options, on a par with the access
functions that were already provided here for other FDW-related
information. Adjust file_fdw.c to use GetForeignColumnOptions
instead of equivalent hand-rolled code. In addition, add some SGML
documentation for the functions exported by foreign.c that are meant
for use by FDW authors. (This is the fdw_helper portion of the
proposed pgsql_fdw patch.) Hanada Shigeru, reviewed by KaiGai Kohei

– Fix indentation of \d footers for non-ASCII cases. Multi-line
“Inherits:” and “Child tables:” footers were misindented when those
strings’ translations involved multibyte characters, because we were
using strlen() instead of an appropriate display width measurement.
In passing, avoid doing gettext() more than once per loop in these
places. While at it, fix pg_wcswidth(), which has been entirely
broken since about 8.2, but fortunately has been unused for the same
length of time. Report and patch by Sergey Burladyan (bug #6480)
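
The difference between byte length and display width that caused the
misindented \d footers can be illustrated with a short sketch. It
assumes UTF-8 text and approximates terminal width with Python's
unicodedata module. The function names and the sample label are
hypothetical and are not taken from psql or its translations.

```python
import unicodedata


def byte_length(text: str) -> int:
    """Bytes in the UTF-8 encoding, which is what strlen() would count."""
    return len(text.encode("utf-8"))


def display_width(text: str) -> int:
    """Approximate number of terminal columns the text occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Wide and fullwidth characters take two columns, others one.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def continuation_indent(label: str) -> str:
    """Spaces that align a continuation line with the text after the label."""
    return " " * (display_width(label) + 1)


if __name__ == "__main__":
    label = "Наследует:"  # sample multibyte label, for illustration only
    print(byte_length(label), display_width(label))
```

For the sample label the byte count is nearly twice the column count,
so indenting by the byte count pushes continuation lines too far to
the right.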

– Improve estimation of IN/NOT IN by assuming array elements are
distinct. In constructs such as “x IN (1,2,3,4)” and “x <>
ALL(ARRAY[1,2,3,4])”, we formerly always used a general-purpose
assumption that the probability of success is independent for each
comparison of “x” to an array element. But in real-world usage of
these constructs, that’s a pretty poor assumption; it’s much saner
to assume that the array elements are distinct and so the match
probabilities are disjoint. Apply that assumption if the operator
appears to behave as equality (for ANY) or inequality (for ALL).
But fall back to the normal independent-probabilities calculation if
this yields an impossible result, i.e. probability > 1 or < 0. We
could protect ourselves against bad estimates even more by
explicitly checking for equal array elements, but that is expensive
and doesn’t seem worthwhile: doing it would amount to optimizing for
poorly-written queries at the expense of well-written ones. Daniele
Varrazzo and Tom Lane, after a suggestion by Ants Aasma
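
The two estimation strategies described above can be compared with a
small sketch. This is not the planner's code: the function names are
hypothetical, and the per-element selectivities are assumed to be
given. Each entry is the estimated probability that the comparison of
"x" with one array element succeeds.

```python
from typing import Sequence


def independent_selectivity(sels: Sequence[float], use_or: bool) -> float:
    """Combine per-element selectivities as independent events."""
    if use_or:  # ANY: P(a or b) = a + b - a*b, applied repeatedly
        result = 0.0
        for s in sels:
            result = result + s - result * s
    else:  # ALL: P(a and b) = a * b
        result = 1.0
        for s in sels:
            result *= s
    return result


def disjoint_selectivity(sels: Sequence[float], use_or: bool) -> float:
    """Assume distinct elements, so matches are mutually exclusive.

    Intended for equality with ANY or inequality with ALL. Falls back to
    the independent calculation if the result is not a valid probability.
    """
    if use_or:  # x = ANY(...): the match probabilities simply add up
        result = sum(sels)
    else:  # x <> ALL(...): one minus the summed chances of equalling an element
        result = 1.0 - sum(1.0 - s for s in sels)
    if result < 0.0 or result > 1.0:
        return independent_selectivity(sels, use_or)
    return result


if __name__ == "__main__":
    print(disjoint_selectivity([0.1] * 4, use_or=True))
    print(independent_selectivity([0.1] * 4, use_or=True))
```

With four elements that each match 10% of rows, the disjoint estimate
for ANY is about 0.4, while the independent estimate is about 0.344.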

– Fix some issues with temp/transient tables in extension scripts.
Phil Sorber reported that a rewriting ALTER TABLE within an
extension update script failed, because it creates and then drops a
placeholder table; the drop was being disallowed because the table
was marked as an extension member. We could hack that specific case
but it seems likely that there might be related cases now or in the
future, so the most practical solution seems to be to create an
exception to the general rule that extension member objects can only
be dropped by dropping the owning extension. To wit: if the DROP is
issued within the extension’s own creation or update scripts, we’ll
allow it, implicitly performing an “ALTER EXTENSION DROP object”
first. This will simplify cases such as extension downgrade scripts
anyway. No docs change since we don’t seem to have documented the
idea that you would need ALTER EXTENSION DROP for such an action to
begin with. Also, arrange for explicitly temporary tables to not
get linked as extension members in the first place, and the same for
the magic pg_temp_nnn schemas that are created to hold them. This
prevents assorted unpleasant results if an extension script creates
a temp table: the forced drop at session end would either fail or
remove the entire extension, and neither of those outcomes is
desirable. Note that this doesn’t fix the ALTER TABLE scenario,
since the placeholder table is not temp (unless the table being
rewritten is). Back-patch to 9.1.

– Revise FDW planning API, again. Further reflection shows that a
single callback isn’t very workable if we desire to let FDWs
generate multiple Paths, because that forces the FDW to do all work
necessary to generate a valid Plan node for each Path. Instead
split the former PlanForeignScan API into three steps:
GetForeignRelSize, GetForeignPaths, GetForeignPlan. We had already
bit the bullet of breaking the 9.1 FDW API for 9.2, so this
shouldn’t cause very much additional pain, and it’s substantially
more flexible for complex FDWs. Add an fdw_private field to
RelOptInfo so that the new functions can save state there rather
than possibly having to recalculate information two or three times.
In addition, we’d not thought through what would be needed to allow
an FDW to set up subexpressions of its choice for runtime execution.
We could treat ForeignScan.fdw_private as an executable expression
but that seems likely to break existing FDWs unnecessarily (in
particular, it would restrict the set of node types allowable in
fdw_private to those supported by expression_tree_walker). Instead,
invent a separate field fdw_exprs which will receive the
postprocessing appropriate for expression trees. (One field is
enough since it can be a list of expressions; also, we assume the
corresponding expression state tree(s) will be held within
fdw_state, so we don’t need to add anything to ForeignScanState.)
Per review of Hanada Shigeru’s pgsql_fdw patch. We may need to
tweak this further as we continue to work on that patch, but to me
it feels a lot closer to being right now.

– Restructure SPGiST opclass interface API to support whole-index
scans. The original API definition was incapable of supporting
whole-index scans because there was no way to invoke leaf-value
reconstruction without checking any qual conditions. Also, it was
inefficient for multiple-qual-condition scans because value
reconstruction got done over again for each qual condition, and
because other internal work in the consistent functions likewise had
to be done for each qual. To fix these issues, pass the whole
scankey array to the opclass consistent functions, instead of only
letting them see one item at a time. (Essentially, the loop over
scankey entries is now inside the consistent functions not outside
them. This makes the consistent functions a bit more complicated,
but not unreasonably so.) In itself this commit does nothing except
save a few cycles in multiple-qual-condition index scans, since we
can’t support whole-index scans on SPGiST indexes until nulls are
included in the index. However, I consider this a must-fix for 9.2
because once we release it will get very much harder to change the
opclass API definition.

– Teach SPGiST to store nulls and do whole-index scans. This patch
fixes the other major compatibility-breaking limitation of SPGiST,
that it didn’t store anything for null values of the indexed column,
and so could not support whole-index scans or “x IS NULL” tests.
The approach is to create a wholly separate search tree for the null
entries, and use fixed “allTheSame” insertion and search rules when
processing this tree, instead of calling the index opclass methods.
This way the opclass methods do not need to worry about dealing with
nulls. Catversion bump is for pg_am updates as well as the change
in on-disk format of SPGiST indexes; there are some tweaks in SPGiST
WAL records as well. Heavily rewritten version of a patch by Oleg
Bartunov and Teodor Sigaev. (The original also stored nulls
separately, but it reused GIN code to do so; which required
undesirable compromises in the on-disk format, and would likely lead
to bugs due to the GIN code being required to work in two very
different contexts.)

– Fix documented type of t_infomask2. Per Koizumi Satoru

– Make parameter name consistent with syntax summary. Thomas Hunger

– Make INSERT/UPDATE queries depend on their specific target columns.
We have always created a whole-table dependency for the target
relation, but that’s not really good enough, as it doesn’t prevent
scenarios such as dropping an individual target column or altering
its type. So we have to create an individual dependency for each
target column, as well. Per report from Bill MacArthur of a rule
containing UPDATE breaking after such an alteration. Note that this
patch doesn’t try to make such cases work, only to ensure that the
attempted ALTER TABLE throws an error telling you it can’t cope with
adjusting the rule. This is a long-standing bug, but given the lack
of prior reports I’m not going to risk back-patching it. A
back-patch wouldn’t do anything to fix existing rules’ dependency
lists, anyway.

Bruce Momjian pushed:

– In pg_upgrade, only lock the old cluster if link mode is used, and
do it right after we restore the schema (a common failure point),
and right before we do the link operation. Per suggestions from
Robert Haas and Alvaro Herrera

Heikki Linnakangas pushed:

– Remove extra copies of LogwrtResult. This simplifies the code a
little bit. The new rule is that to update XLogCtl->LogwrtResult,
you must hold both WALWriteLock and info_lck, whereas before we had
two copies, one that was protected by WALWriteLock and another
protected by info_lck. The code that updates them was already
holding both locks, so merging the two is trivial. The third copy,
XLogCtl->Insert.LogwrtResult, was not totally redundant, it was used
in AdvanceXLInsertBuffer to update the backend-local copy, before
acquiring the info_lck to read the up-to-date value. But the value
of that seems dubious; at best it’s saving one spinlock acquisition
per completed WAL page, which is not significant compared to all the
other work involved. And in practice, it’s probably not saving even
that much.

– Simplify the way changes to full_page_writes are logged. It’s
harmless to do full page writes even when not strictly necessary, so
when turning full_page_writes on, we can set the global flag first,
and then call XLogInsert. Likewise, when turning it off, we can
write the WAL record first, and then clear the flag. This way
XLogInsert doesn’t need any special handling of the XLOG_FPW_CHANGE
record type. XLogInsert is complicated enough already, so anything
we can keep away from there is a good thing. Actually I don’t think
the atomicity of the shared memory flag matters, anyway, because we
only write the XLOG_FPW_CHANGE at the end of recovery, when there
are no concurrent WAL insertions going on. But might as well make it
safe, in case we allow changing full_page_writes on the fly in the
future.
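
The ordering rule for full_page_writes changes can be sketched as a
toy model. This is not PostgreSQL's implementation: the class, the
list standing in for WAL, and the record name are hypothetical. It
only shows that the flag is on at the moment either change record is
written.

```python
class FullPageWritesModel:
    """Toy model of a shared flag plus an append-only log."""

    def __init__(self) -> None:
        self.flag = False        # stand-in for the shared-memory flag
        self.wal = []            # stand-in for the WAL stream
        self.flag_at_write = []  # flag value seen when each record was written

    def _log_change(self, new_value: bool) -> None:
        self.flag_at_write.append(self.flag)
        self.wal.append(("FPW_CHANGE", new_value))

    def turn_on(self) -> None:
        # Set the flag first, then write the record: extra full page
        # writes in between are harmless.
        self.flag = True
        self._log_change(True)

    def turn_off(self) -> None:
        # Write the record first, then clear the flag.
        self._log_change(False)
        self.flag = False


if __name__ == "__main__":
    model = FullPageWritesModel()
    model.turn_on()
    model.turn_off()
    print(model.wal, model.flag_at_write, model.flag)
```

Because full page writes stay enabled across both transitions, the
code that writes the record needs no special case for it in this
model.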

– Make the comments more clear on the fact that UpdateFullPageWrites()
is not safe to call concurrently from multiple processes.

– Silence warning about unused variable, when building without

– Update outdated comment. HeapTupleHeader.t_natts field doesn’t exist
anymore. Kevin Grittner

Robert Haas pushed:

– Typo fix. Fujii Masao.

– psql: Avoid some spurious output if the server croaks. Fixes a
regression in commit 08146775acd8bfe0fcc509c71857abb928697171. Noah

– Extend object access hook framework to support arguments, and DROP.
This allows loadable modules to get control at drop time, perhaps
for the purpose of performing additional security checks or to log
the event. The initial purpose of this code is to support sepgsql,
but other applications should be possible as well. KaiGai Kohei,
reviewed by me.

– sepgsql DROP support. KaiGai Kohei

Tatsuo Ishii pushed:

– Add description for --no-locale and --text-search-config.

Michael Meskes pushed:

– Removed redundant “the” from ecpg’s docs. Typo spotted by Erik

== Rejected Patches (for now) ==

No one was disappointed this week

== Pending Patches ==

Pavel Stehule and Alvaro Herrera traded patches for the CHECK FUNCTION

Shigeru HANADA sent in four more revisions of the patch to add a
PostgreSQL FDW.

Kyotaro HORIGUCHI and Marko Kreen traded patches to add a new method
of storing tuples to libpq and use same to make dblink more efficient.

KaiGai Kohei and Yeb Havinga traded patches to add a new
sepgsql.client_label GUC.

Dimitri Fontaine sent in three more revisions of the patch to add
command triggers.

Tomas Vondra sent in two revisions of a patch to fix some regression
test errors that appear in a Czech locale, cs_CZ.

Robert Haas sent in a patch to speed up the creation of error

Bruce Momjian sent in two more revisions of a patch to fix the
documentation for pg_upgrade --logfile.

Alexander Shulgin sent in two more revisions of a patch to support URI
connection strings in libpq.

Pavel Stehule and Petr (PJMODOS) Jelinek traded patches to add CHECK
TRIGGER and related functionality.

Fujii Masao sent in a patch to extend pg_stat_statements so that it
reports the planning time.

Jaime Casanova and Robert Haas traded patches to extend

Jaime Casanova sent in a trimmed-down version of the patch to add GIN
and SP-GiST support to pgstattuple.

Marti Raudsepp sent in a patch to optimize certain cases where IS

Robert Haas sent in a patch to add a pg_prewarm utility.

Fujii Masao sent in a patch to fix a bug in walsender which causes
high CPU usage.

Antonin Houska sent in a WIP patch implementing some sub-cases of
LATERAL for function calls.

Marti Raudsepp sent in another revision of the patch to refactor
