== PostgreSQL Weekly News – August 21 2011 ==
== PostgreSQL Product News ==
MyJSQLView 3.30, a GUI tool that can be used with PostgreSQL, released.
pgpool-II 3.1.0 beta1, a connection pooler and more, released.
A German language tutorial for PostgreSQL 9.0 has been released.
pgwatch 1.0beta2, a monitoring tool for PostgreSQL, released.
== PostgreSQL Jobs for August ==
== PostgreSQL Local ==
Postgres Open 2011, a conference focused on disruption of the database
industry through PostgreSQL, will take place September 14-16, 2011 in
Chicago, Illinois at the Westin Michigan Avenue hotel.
PG-Day Denver 2011 will be held on Friday, October 21st, 2011 at
the Auraria Campus near downtown Denver, Colorado.
PostgreSQL Conference West (#PgWest) will be held September 27th-30th,
2011 at the San Jose Convention Center in San Jose, California, USA.
PostgreSQL Conference Europe 2011 will be held on October 18-21 in
pgbr will be in Sao Paulo, Brazil November 3-4, 2011.
PGConf.DE 2011 is the German-speaking PostgreSQL Conference and will
take place on November 11th in the Rheinisches Industriemuseum in
Oberhausen, Germany. Call for Papers is open.
== PostgreSQL in the News ==
Planet PostgreSQL: http://planet.postgresql.org/
PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to email@example.com, German language
to firstname.lastname@example.org, Italian language to email@example.com. Spanish language
== Reviews ==
== Applied Patches ==
Tom Lane pushed:
- Fix unsafe order of operations in foreign-table DDL commands. When
updating or deleting a system catalog tuple, it’s necessary to
acquire RowExclusiveLock on the catalog before looking up the tuple;
otherwise a concurrent VACUUM FULL on the catalog might move the
tuple to a different TID before we can apply the update. Coding
patterns that find the tuple via a table scan aren’t at risk here,
but when obtaining the tuple from a catalog cache, correct ordering
is important; and several routines in foreigncmds.c got it wrong.
Noted while running the regression tests in parallel with VACUUM
FULL of assorted system catalogs. For consistency I moved all the
heap_open calls to the starts of their functions, including a couple
for which there was no actual bug. Back-patch to 8.4 where
foreigncmds.c was added.
- Fix race condition in relcache init file invalidation. The previous
code tried to synchronize by unlinking the init file twice, but that
doesn’t actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that
would tell it the data is stale. The result would be bizarre
failures in catalog accesses, typically “could not read block 0 in
file …” later during startup. Instead, hold RelCacheInitLock
across both the unlink and the sending of the SI messages. This is
more straightforward, and might even be a bit faster since only one
unlink call is needed. This has been wrong since it was put in (in
2002!), so back-patch to all supported releases.
- Preserve toast value OIDs in toast-swap-by-content for
CLUSTER/VACUUM FULL. This works around the problem that a catalog
cache entry might contain a toast pointer that we try to dereference
just as a VACUUM FULL completes on that catalog. We will see the
sinval message on the cache entry when we acquire lock on the toast
table, but by that point we’ve already told tuptoaster.c “here’s the
pointer to fetch”, so it’s difficult from a code structural
standpoint to update the pointer before we use it. Much less
painful to ensure that toast pointers are not invalidated in the
first place. We have to add a bit of code to deal with the case
that a value that previously wasn’t toasted becomes so; but that
should be a seldom-exercised corner case, so the inefficiency
shouldn’t be significant. Back-patch to 9.0. In prior versions, we
didn’t allow CLUSTER on system catalogs, and VACUUM FULL didn’t
result in reassignment of toast OIDs, so there was no problem.
- Fix incorrect order of operations during sinval reset processing.
We have to be sure that we have revalidated each nailed-in-cache
relcache entry before we try to use it to load data for some other
relcache entry. The introduction of “mapped relations” in 9.0 broke
this, because although we updated the state kept in relmapper.c
early enough, we failed to propagate that information into relcache
entries soon enough; in particular, we could try to fetch pg_class
rows out of pg_class before we’d updated its relcache entry’s
rd_node.relNode value from the map. This bug accounts for Dave
Gould’s report of failures after “vacuum full pg_class”, and I
believe that there is risk for other system catalogs as well. The
core part of the fix is to copy relmapper data into the relcache
entries during “phase 1” in RelationCacheInvalidate(), before
they’ll be used in “phase 2”. To try to future-proof the code
against other similar bugs, I also rearranged the order in which
nailed relations are visited during phase 2: now it’s pg_class
first, then pg_class_oid_index, then other nailed relations. This
should ensure that RelationClearRelation can apply
RelationReloadIndexInfo to all nailed indexes without risking use of
not-yet-revalidated relcache entries. Back-patch to 9.0 where the
relation mapper was introduced.
- Forget about targeting catalog cache invalidations by tuple TID.
The TID isn’t stable enough: we might queue an sinval event before a
VACUUM FULL, and then process it afterwards, when the target tuple
no longer has the same TID. So we must invalidate entries on the
basis of hash value only. The old coding can be shown to result in
various bizarre, hard-to-reproduce errors in the presence of
concurrent VACUUM FULLs on system catalogs, and could easily result
in permanent catalog corruption, up to and including complete loss
of tables. This commit is just a minimal fix that removes the
unsafe comparison. We should remove transmission of the tuple TID
from sinval messages altogether, and then arrange to suppress the
extra message in the common case of a heap_update that doesn’t
change the key hashvalue. But that’s going to be much more
invasive, and will only produce a probably-marginal performance
gain, so it doesn’t seem like material for a back-patch. Back-patch
to 9.0. Before that, VACUUM FULL refused to do any tuple moving if
it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
CLUSTER would give up altogether), so there was no risk of moving a
tuple that might be the subject of an unsent sinval message.
- Revise sinval code to remove no-longer-used tuple TID from inval
messages. This requires adjusting the API for syscache callback
functions: they now get a hash value, not a TID, to identify the
target tuple. Most of them weren’t paying any attention to that
argument anyway, but plancache did require a small amount of fixing.
Also, improve performance a trifle by avoiding sending duplicate
inval messages when a heap_update isn’t changing the catcache lookup
- Fix two issues in plpython’s handling of composite results. Dropped
columns within a composite type were not handled correctly. Also,
we did not check for whether a composite result type had changed
since we cached the information about it. Jan Urbański, per a bug
report from Jean-Baptiste Quenot
- Update 9.1 release notes to reflect commits through today. Also do
another pass of copy-editing.
- Explain max_prepared_transactions requirement in isolation tests’
README. Now that we have a test that requires nondefault settings
to pass, it seems like we’d better mention that detail in the
directions about how to run the tests. Also do some very minor
- Tag 9.1rc1.
- Fix performance problem when building a lossy tidbitmap. As pointed
out by Sergey Koposov, repeated invocations of tbm_lossify can make
building a large tidbitmap into an O(N^2) operation. To fix, make
sure we remove more than the minimum amount of information per call,
and add a fallback path to behave sanely if we're unable to fit the
bitmap within the requested amount of memory. This has been wrong
since the tidbitmap code was written, so back-patch to all supported
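To illustrate the catalog cache invalidation change described in Tom
Lane's list above, here is a minimal toy model in Python, not PostgreSQL
code. The names CacheEntry, invalidate_by_tid and invalidate_by_hash are
hypothetical. The point is only that a message matched on TID can miss
its target once a table rewrite has moved the tuple, while matching on
hash value alone still finds it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    hash_value: int   # hash of the lookup key (stable across rewrites)
    tid: tuple        # (block, offset); changes when the tuple is moved
    payload: str


def invalidate_by_tid(cache, hash_value, tid):
    """Old-style matching: drop entries only if hash AND TID both match."""
    return [e for e in cache
            if not (e.hash_value == hash_value and e.tid == tid)]


def invalidate_by_hash(cache, hash_value):
    """Hash-only matching: drop every entry with this hash value."""
    return [e for e in cache if e.hash_value != hash_value]


# Assumed scenario: the message was queued while the tuple lived at (0, 1);
# the cached copy refers to the same logical row at its new location (7, 3).
cache = [CacheEntry(hash_value=42, tid=(7, 3), payload="stale row")]
queued_message = {"hash_value": 42, "tid": (0, 1)}

after_tid = invalidate_by_tid(cache, **queued_message)
after_hash = invalidate_by_hash(cache, queued_message["hash_value"])
print(len(after_tid), len(after_hash))
```

In this model, hash-only matching may also discard unrelated entries
whose keys happen to share a hash value. That costs a reload but never
leaves a stale entry behind.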
Peter Eisentraut pushed:
- Add "Reason code" prefix to internal SSI error messages. This makes
it clearer that the error message is perhaps not supposed to be
understood by users, and it also makes it somewhat clearer that it
was not accidentally omitted from translation. Idea from Heikki
Linnakangas, except that we don't mark "Reason code" for translation
at this point, because that would make the implementation too
- Adjust regression tests for error message change
- Use less cryptic variable names
- Make pg_basebackup progress report translatable. Also fix a
potential portability bug, because INT64_FORMAT is only guaranteed
to be available with snprintf, not fprintf.
- MacOS -> Mac OS. Josh Kupershmidt
- Move \r out of translatable strings. The translation tools are very
unhappy about seeing \r in translatable strings, so move it to a
separate fprintf call.
- Translation updates
- Improve detection of Python 3.2 installations. Because of ABI
tagging, the library version number might no longer be exactly the
Python version number, so do extra lookups. This affects
installations without a shared library, such as ActiveState's
installer. Also update the way to detect the location of the
'config' directory, which can also be versioned. Ashesh Vashi
- Change PyInit_plpy to external linkage. Module initialization
functions in Python 3 must have external linkage, because
PyMODINIT_FUNC does dllexport on Windows-like platforms. Without
this change, the build with Python 3 fails on Windows.
- Hide unused variable warnings under Python 3
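The carriage-return item in the list above can be sketched as follows.
This is a hedged Python illustration rather than the pg_basebackup C
code: report_progress is a hypothetical helper, and gettext.gettext
stands in for the message-translation call. Only the message text goes
through translation; the \r that returns the cursor to the start of the
line is written separately.

```python
import gettext
import io

_ = gettext.gettext  # returns the message unchanged when no catalog is loaded


def report_progress(out, done_kb, total_kb, final=False):
    """Write one progress line; control characters stay out of the msgid."""
    # Translatable part: plain text and format placeholders only.
    out.write(_("%d/%d kB") % (done_kb, total_kb))
    # Non-translatable part: rewind the line, or end it on the last report.
    out.write("\n" if final else "\r")


buf = io.StringIO()
report_progress(buf, 10, 200)
report_progress(buf, 200, 200, final=True)
print(repr(buf.getvalue()))
```

Keeping the control character in its own write means translators only
ever see printable text and placeholders.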
Bruce Momjian pushed:
- In pg_upgrade, avoid dumping orphaned temporary tables. This makes
the pg_upgrade schema matching pattern match pg_dump/pg_dumpall.
Fix for 9.0, 9.1, and 9.2. Report and proposed bug fix by David
- In pg_upgrade, don't copy visibility map files from clusters that
did not have crash-safe visibility maps to clusters that expect
crash-safety. Request from Robert Haas.
- Implement src/tools/copyright as a Perl program, so anyone can run
it. David Fetter
- Add executable bit to file.
- Remove use of 'tie' in perl for copyright.pl; instead use normal
- Fix problem with regex in copyright test. Report and fix by Kris
- Fix copyright.pl to properly use the 'tie' function. Kris Jurka
- Have thread_test create its test files in the current directory,
rather than /tmp. Also clean up C defines and add comments. Per
report by Alex Soto
Heikki Linnakangas pushed:
- Fix bogus comment that claimed that the new BACKUP METHOD line in
backup_label was new in 9.0. Spotted by Fujii Masao.
- If backup-end record is not seen, and we reach end of recovery from
a streamed backup, throw an error and refuse to start up. The
restore has not finished correctly in that case and the data
directory is possibly corrupt. We already errored out in case of
archive recovery, but could not during crash recovery because we
couldn't distinguish between the case that pg_start_backup() was
called and the database then crashed (must not error, data is OK),
and the case that we're restoring from a backup and not all the
needed WAL was replayed (data can be corrupt). To distinguish those
cases, add a line to backup_label to indicate whether the backup was
taken with pg_start/stop_backup(), or by streaming (i.e.,
pg_basebackup). This is a different implementation than what I
committed to 9.2 a week ago. That implementation was not
back-patchable because it required re-initdb. Fujii Masao
- Fix comment about which version had BACKUP METHOD line in
backup_label, again. It was invalidated again by Fujii's patch to
- Teach pg_controldata and pg_resetxlog about the new
backupEndRequired field in control file.
- Strip whitespace from SQL blocks in the isolation test suite. This
is purely cosmetic; it removes a lot of IMHO ugly whitespace from
the expected output.
- Add an SSI regression test that tests all interesting permutations
in the order of begin, prepare, and commit of three concurrent
transactions that have conflicts between them. The test runs for
quite a long time, and the expected output file is huge, but this test
caught some serious bugs during development, so it seems worthwhile to
keep. The test uses prepared transactions, so it fails if the server
has max_prepared_transactions=0. Because of that, it's marked as
"ignore" in the schedule file. Dan Ports
Magnus Hagander pushed:
- Adjust total size in pg_basebackup progress report when reality
changes. When streaming including WAL, the size estimate will
always be incorrect, since we don't know how much WAL is included.
To make sure the output doesn't look completely unreasonable, this
patch increases the total size whenever we go past the estimate, to
make sure we never go above 100%.
- Adjust wording now that estimated size can increase. Per comment
from Fujii Masao.
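The progress adjustment described above amounts to growing the estimate
whenever the amount transferred passes it. A minimal Python sketch,
assuming a hypothetical progress_percent helper and sizes in kB (the
actual pg_basebackup code is C and is not reproduced here):

```python
def progress_percent(done_kb, total_kb):
    """Return (percent, total_kb), raising the total if the estimate was too low."""
    if done_kb > total_kb:
        # The estimate did not include WAL; grow it so we never pass 100%.
        total_kb = done_kb
    percent = done_kb * 100 // total_kb if total_kb > 0 else 0
    return percent, total_kb


total = 200
for done in (50, 200, 260):
    percent, total = progress_percent(done, total)
    print(f"{done}/{total} kB ({percent}%)")
```

The caller keeps the returned total for the next report, so once the
estimate has been raised it stays raised.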
Andrew Dunstan pushed:
- Properly handle empty arrays returned from plperl functions. Bug
reported by David Wheeler, fix by Alex Hunsaker.
Robert Haas pushed:
- Remove obsolete README file. Perhaps we ought to add some other
kind of documentation here instead, but for now let's get rid of
this woefully obsolete description of the sinval machinery.
- Make lazy_vacuum_rel call pg_rusage_init only if needed.
do_analyze_rel already does it this way. Euler Taveira de Oliveira
- Typo fix.
- Allow sepgsql regression tests to be run from a user homedir.
KaiGai Kohei, with some changes by me.
- Fix contrib/sepgsql and contrib/xml2 to always link required
libraries. contrib/xml2 can get by without libxslt; the relevant
features just won't work. But if it doesn't have libxml2, or if
sepgsql doesn't have libselinux, the link succeeds but the module
then fails to work at load time. To avoid that, link the required
libraries unconditionally, so that it will be clear at link-time
that there is a problem. Per discussion with Tom Lane and KaiGai
- Clean up 'chkselinuxenv' script. Eliminate dependencies on "which",
as we don't really need that to be installed for proper testing.
Don't number the tests, as that increases the footprint of every
patch that wants to add or remove tests. Make the test output more
informative, so that it's a bit easier to see what went right (or
wrong). Spelling and grammar improvements.
== Rejected Patches (for now) ==
No one was disappointed this week :-)
== Pending Patches ==
Joachim Wieland sent in another revision of the patch to provide
facilities for exporting and using snapshots.
Magnus Hagander sent in a patch intended to address some infelicities
in the representation of timestamptzs in replication.
KaiGai Kohei sent in three patches to unify DROP into a single
Heikki Linnakangas and Alexander Korotkov traded new revisions of the
patch to speed up GiST index builds.
Fujii Masao sent in two revisions of a patch to fix some issues in
Jeevan Chalke sent in a patch to allow the same cursor names in nested
Magnus Hagander sent in another revision of the patch to implement
Josh Kupershmidt sent in a patch to fix up the pg_comments view.
Greg Smith sent in a patch that tracks and displays the accumulated
cost when autovacuum is running. Code by Noah Misch and Greg Smith.
Josh Kupershmidt sent in a patch to fix some infelicities in
Shigeru HANADA sent in two more revisions of the patch which gives the
format of FDW options.
KaiGai Kohei sent in two more revisions of the patch to allow access
to the userspace access vector cache.
Wojciech Muła sent in a patch to fix some infelicities in PL/pgsql's
handling of %TYPE in arrays.