PostgreSQL Weekly News – November 21 2010
== PostgreSQL Weekly News – November 21 2010 ==
== PostgreSQL Product News ==
DiffKit 0.8.2, a table comparison tool with support for PostgreSQL, released.
PostgreDAC 2.6.0, a Delphi/C++ builder for PostgreSQL, released.
== PostgreSQL Jobs for November ==
== PostgreSQL Local ==
The PostgreSQLFr Call for Projects has begun. Projects must target
PostgreSQL and the French-speaking community. Mail appel-projets-2010
AT postgresql DOT fr.
Registration for PGDay.EU 2010 held on December 6-8 in Stuttgart,
Germany is open.
Early bird registration for PGDay.IT, December 10, 2010 in Rome,
Italy, is open and the schedule is online.
FOSDEM is one of the biggest Free and Open Source event held annually
in Brussels, Belgium, on February 5-6, 2011, and attended by around
4000 people. Send in your proposal to fosdem AT postgresql DOT eu.
PGDay is scheduled at this year’s Southern California Linux Exposition
(SCALE) held in the LAX Hilton Hotel in the city of Los Angeles,
California, on Friday February 25th, 2011. Post your talk submission
to pgday-submissions AT googlegroups DOT com.
PGCon will be May 19-20, 2011 at the University of Ottawa, preceded by
two days of tutorials on May 17-18.
== PostgreSQL in the News ==
Planet PostgreSQL: http://planet.postgresql.org/
PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to email@example.com, German language
to firstname.lastname@example.org, Italian language to email@example.com. Spanish language
== Reviews ==
Steve Singer reviewed the patch to extend ALTER TABLE to allow
creating a primary key from an extant index, suggesting some changes
in the documentation and error handling.
Robert Haas reviewed the KNN-GiST patch, pointing out places where it
re-uses machinery with now-misleading names, along with some cases it
Robert Haas reviewed the patch for ALTER OBJECT … SET SCHEMA, noting
a lack of regression tests and tab completion support for psql.
== Applied Patches ==
Simon Riggs pushed:
– Avoid spurious Hot Standby conflicts from btree delete records.
Similar conflicts were already avoided for related record types.
Massive over-caution resulted in a usability bug. Clear theoretical
basis for doing this is now confirmed by me. Request to remove from
Heikki (twice), over-caution by me.
Alvaro Herrera pushed:
– Fix copy-and-pasteo.
– Add pg_describe_object function. This function is useful to obtain
textual descriptions of objects as stored in pg_depend.
Robert Haas pushed:
– Fix copy-and-pasteo a little more completely. copydir.c is no
longer in src/port
– Remove outdated comments from the regression test files. Since
2004, int2 and int4 operators do detect overflow; this was fixed by
commit 4171bb869f234281a13bb862d3b1e577bf336242. Extracted from a
larger patch by Andres Freund.
– Add new buffers_backend_fsync field to pg_stat_bgwriter. This new
field counts the number of times that a backend which writes a
buffer out to the OS must also fsync() it. This happens when the
bgwriter fsync request queue is full, and is generally detrimental
to performance, so it’s good to know when it’s happening. Along the
way, log a new message at level DEBUG1 whenever we fail to hand off
an fsync, so that the problem can also be seen in examination of log
files (if the logging level is cranked up high enough). Greg Smith,
with minor tweaks by me.
– Document the dummy_seclabel contrib module. KaiGai Kohei, with
editing and markup fixes by me.
– Minor corrections to dummy_seclabel documentation. Problems noted
by Thom Brown.
– Speed up conversion of signed integers to C strings. A hand-coded
implementation turns out to be much faster than calling printf().
In passing, add a few more regresion tests. Andres Freund, with
assorted, mostly cosmetic changes.
– Attempt to fix breakage caused by signed integer conversion patch.
Use INT_MIN rather than INT32_MIN as we do elsewhere in the code,
and try to work around nonexistence of INT64_MIN if necessary.
Adjust the new regression tests to something hopefully saner, per
observation by Tom Lane.
– Update int8-exp-three-digits.out to match new contents of int8.out.
– Expose quote_literal_cstr() from core. This eliminates the need for
inefficient implementions of this functionality in both
contrib/dblink and contrib/tablefunc, so remove them. The upcoming
patch implementing an in-core format() function will also require
this functionality. In passing, add some regression tests.
– Minor cleanup of indxpath.c. Eliminate some superfluous notational
complexity around match_clause_to_indexcol(), and rip out the
– Add new SQL function, format(text). Currently, three conversion
format specifiers are supported: %s for a string, %L for an SQL
literal, and %I for an SQL identifier. The latter two are
deliberately designed not to overlap with what sprintf() already
supports, in case we want to add more of sprintf()’s functionality
here later. Patch by Pavel Stehule, heavily revised by me.
Reviewed by Jeff Janes and, in earlier versions, by Itagaki Takahiro
and Tom Lane.
– Bump catversion. Should have done this as part of format(text)
– Put back accidentally-deleted quote_literal() regression tests.
Tom Lane pushed:
– Fix aboriginal mistake in plpython’s set-returning-function support.
We must stay in the function’s SPI context until done calling the
iterator that returns the set result. Otherwise, any attempt to
invoke SPI features in the python code called by the iterator will
malfunction. Diagnosis and patch by Jan Urbanski, per bug report
from Jean-Baptiste Quenot. Back-patch to 8.2; there was no support
for SRFs in previous versions of plpython.
– Make TRUNCATE … RESTART IDENTITY restart sequences
transactionally. In the previous coding, we simply issued ALTER
SEQUENCE RESTART commands, which do not roll back on error. This
meant that an error between truncating and committing left the
sequences out of sync with the table contents, with potentially bad
consequences as were noted in a Warning on the TRUNCATE man page.
To fix, create a new storage file (relfilenode) for a sequence that
is to be reset due to RESTART IDENTITY. If the transaction aborts,
we’ll automatically revert to the old storage file. This acts just
like a rewriting ALTER TABLE operation. A penalty is that we have
to take exclusive lock on the sequence, but since we’ve already got
exclusive lock on its owning table, that seems unlikely to be much
of a problem. The interaction of this with usual nontransactional
behaviors of sequence operations is a bit weird, but it’s hard to
see what would be completely consistent. Our choice is to discard
cached-but-unissued sequence values both when the RESTART is
executed, and at rollback if any; but to not touch the currval()
state either time. In passing, move the sequence reset operations
to happen before not after any AFTER TRUNCATE triggers are fired.
The previous ordering was not logically sensible, but was forced by
the need to minimize inconsistency if the triggers caused an error.
Transactional rollback is a much better solution to that. Patch by
Steve Singer, rather heavily adjusted by me.
– Further fallout from the MergeAppend patch. Fix things so that
top-N sorting can be used in child Sort nodes of a MergeAppend node,
when there is a LIMIT and no intervening joins or grouping.
Actually doing this on the executor side isn’t too bad, but it’s a
bit messier to get the planner to cost it properly. Per gripe from
Robert Haas. In passing, fix an oversight in the original
top-N-sorting patch: query_planner should not assume that a LIMIT
can be used to make an explicit sort cheaper when there will be
grouping or aggregation in between. Possibly this should be
back-patched, but I’m not sure the mistake is serious enough to be a
real problem in practice.
– Dept of second thoughts: don’t try to push LIMIT below a SRF. If we
have Limit->Result->Sort, the Result might be projecting a tlist
that contains a set-returning function. If so, it’s possible for
the SRF to sometimes return zero rows, which means we could need to
fetch more than N rows from the Sort in order to satisfy LIMIT N.
So top-N sorting cannot be used in this scenario.
– Improve plpgsql’s error reporting for no-such-column cases. Given a
column reference foo.bar, where there is a composite plpgsql
variable foo but it doesn’t contain a column bar, the pre-9.0 coding
would immediately throw a “record foo has no field bar” error. In
9.0 the parser hook instead falls through to let the core parser see
if it can resolve the reference. If not, you get a complaint about
“missing FROM-clause entry for table foo”, which while in some sense
correct isn’t terribly helpful. Complicate things a bit so that we
can throw the old error message if neither the core parser nor the
hook are able to resolve the column reference, while not changing
the behavior in any other case. Per bug #5757 from Andrey Galkin.
– Improve relation width estimation for subqueries. As per the
ancient comment for set_rel_width, it really wasn’t much good for
relations that aren’t plain tables: it would never find any stats
and would always fall back on datatype-based estimates, which are
often pretty silly. Fix that by copying up width estimates from the
subquery planning process. At some point we might want to do this
for CTEs too, but that would be a significantly more invasive patch
because the sub-PlannerInfo is no longer accessible by the time it’s
needed. I refrained from doing anything about that, partly for fear
of breaking the unmerged CTE-related patches. In passing, also
generate less bogus width estimates for whole-row Vars. Per a gripe
from Jon Nelson.
– Fix leakage of cost_limit when multiple autovacuum workers are
active. When using default autovacuum_vac_cost_limit,
autovac_balance_cost relied on VacuumCostLimit to contain the
correct global value … but after the first time through in a
particular worker process, it didn’t, because we’d trashed it in
previous iterations. Depending on the state of other autovac
workers, this could result in a steady reduction of the effective
cost_limit setting as a particular worker processed more and more
tables, causing it to go slower and slower. Spotted by Simon Poole
(bug #5759). Fix by saving and restoring the GUC variables in the
loop in do_autovacuum. In passing, improve a few comments.
Back-patch to 8.3 … the cost rebalancing code has been buggy since
it was put in.
– Assorted further cleanup for integer-conversion patch. Avoid
depending on LL notation, which is likely to not work in pre-C99
compilers; don’t pointlessly use INT32_MIN/INT64_MIN in code that
has the numerical value hard-wired into it anyway; remove some
gratuitous style inconsistencies between pg_ltoa and pg_lltoa; fix
int2 test case so it actually tests int2.
– Further cleanup of indxpath logic related to IndexOptInfo.opfamily
array. We no longer need the terminating zero entry in opfamily,
so get rid of it. Also replace assorted ad-hoc looping logic with
simple for and foreach constructs. This code is now noticeably more
readable than it was an hour ago; credit to Robert for seeing that
it could be simplified.
Heikki Linnakangas pushed:
– The GiST scan algorithm uses LSNs to detect concurrent pages splits,
but temporary indexes are not WAL-logged. We used a constant LSN
for temporary indexes, on the assumption that we don’t need to worry
about concurrent page splits in temporary indexes because they’re
only visible to the current session. But that assumption is wrong,
it’s possible to insert rows and split pages in the same session,
while a scan is in progress. For example, by opening a cursor and
fetching some rows, and INSERTing new rows before fetching some
more. Fix by generating fake increasing LSNs, used in place of
real LSNs in temporary GiST indexes.
– Remove unused parameter. Patch by Shigeru Hanada.
Magnus Hagander pushed:
– Send paramHandle to subprocesses as 64-bit on Win64. The handle to
the shared memory segment containing startup parameters was sent as
32-bit even on 64-bit systems. Since HANDLEs appear to be allocated
sequentially this shouldn’t be a problem until we reach 232 open
handles in the postmaster, but a 64-bit value should be sent across
as 64-bit, and not zero out the top 32 bits. Noted by Tom Lane.
Andrew Dunstan pushed:
– Require VALUE keyword when extending an enum type. Based on a patch
from Alvaro Herrera.
Peter Eisentraut pushed:
– Additional fixes for parallel make. Add some additional
dependencies to constrain the build order to prevent parallel make
from failing. In the case of src/Makefile, this is likely to be too
complicated to be worth maintaining, so just add .NOTPARALLEL to get
the old for-loop-like behavior. More fine-tuning might be necessary
for some platforms or configurations.
== Rejected Patches (for now) ==
No one was disappointed this week
== Pending Patches ==
Alexander Korotkov sent in two revisions of a patch to fix the
picksplit function for contrib/seg.
Robert Haas sent in another revision of the patch to add custom code
for int(32|64)-to-text conversions. These are much faster than the
Peter Eisentraut sent in a patch intended to fix some misbehavior in
parallel make on Darwin.
Peter Eisentraut sent in another revision of the patch to add
Greg Sabino Mullane sent in a patch to fix an infelicity with the
clang compiler, which doesn’t like to see gcc directives.
Robert Haas sent in another set of WIP patches to implement unlogged
Heikki Linnakangas sent in a patch to rewrite how GiST does its
insertions for better behavior with WAL logging.
KaiGai Kohei sent in another revision of the patch to allow switching
Alvaro Herrera sent in a patch to add a describe_object function.
ITAGAKI Takahiro sent in another WIP patch to implement MULTISETs and
add some new functions for ARRAYs.
Fujii Masao sent in a patch to fix the documentation of libpq for
Marko (johto) Tiikkaja sent in a patch to have EXPLAIN ANALYZE show
the nubmer of rows a plan qual filtered in from a node’s input.
Pavel Stehule sent in another revision of the patch to add formatting
Shigeru HANADA sent in another revision of the SQL/MED patch.
Bruce Momjian sent in another revision of the patch to fix some errors
in libpq’s handling of host and addresses for IPv4 and IPv6.
Radoslaw Smogura sent in an experimental JDBC driver.
Dimitri Fontaine sent in another revision of the patch to add ALTER
OBJECT … SET SCHEMA to those objects where this is not already
Joachim Wieland sent in two more patches in support of parallel
pg_dump, one for modular I/O compression in pg_dump, the other for a
directory-structured dump format.
Comments are closed.