Re: fsync in glib/gio



On Sun, 2009-03-15 at 11:31 -0400, Mark Mielke wrote:
> Stef Walter wrote: 
> > Mark Mielke wrote:
> >   
> > fsync() was really broken on ext3. Now, all of a sudden it's "teh
> > awesome!!!! FTW!!!"
> > 
> > There's a reason people haven't been using it. It could take an obscene
> > amount of time to complete depending on what you happened to be doing in
> > elsewhere in the (multi-tasking, no less) OS.
> >   
> 
> This depends on what your priority is. If your priority is to be
> absolutely certain that the file be intact on power failure - then you
> really have no other option.
> 
> Most people have not had this requirement in the past.

Nobody is arguing for this. It's a straw man. All we want is for the
standard atomic (during runtime) save method to not cause catastrophic
data loss (i.e. losing both the old and the new file) during crashes.

> The rename to effect atomic-change-in-place is a scenario where you
> want a stronger guarantee. It's not about fsync() being "awesome" -
> it's about it being necessary to achieve this guarantee in a portable
> way - whatever the cost.

The rename method is used because it guarantees (by way of POSIX) that
the update is atomic. I.e. any other process opening the file for
reading will at any time find the *entire* contents of either the new
file or the old file, never "no file" and never a partial or broken
file.

That way, if you're e.g. updating /etc/passwd while someone else is
logging in you'll never run into a problem where /etc/passwd is
partially written and your user is not available.

This is why the atomic rename is used, and it's not related to any
on-disk guarantee wrt crashes. It really has nothing to do with system
crash safety (witness e.g. the pam /etc/passwd writer doing rename
without fsync).
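
To make the pattern concrete, here is a minimal sketch of the
write-temp-then-rename save in Python. The function name
atomic_replace and the ".tmp-" prefix are made up for illustration;
this is not the glib or pam code. Note that there is deliberately no
fsync here: it only gives the runtime guarantee described above.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace `path` with `data` so readers see the old or the new file.

    Hypothetical helper for illustration: runtime atomicity only, no
    on-disk (post-crash) guarantee, since nothing is fsync'ed.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # rename() over an existing name is atomic per POSIX;
        # os.replace() maps to it on POSIX systems.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens the target at any point during atomic_replace()
gets either the complete old contents or the complete new contents.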

However, given that POSIX guarantees this during runtime and since it's
a very commonly used pattern (due to the runtime guarantees) it would be
a nice property of a filesystem to extend this to post-crash guarantees,
rather than fucking over everyone who uses it by leaving 0 byte files
behind for this commonly used operation.

Of course, given that POSIX allows this behaviour we should probably
use the fsync hammer to reduce the risk of data loss, at least in some
cases. But to argue that such behaviour from the filesystem is *good*?
It boggles the mind.

Anyway, this argument is over for me. XFS long had problems with this,
but it has now been changed so that rename overwrite is safe (they even
verify this in their QA runs these days), ext4 will have patches for
this in 2.6.30, and the btrfs maintainer said he will queue similar
patches for 2.6.30. We'll probably add the fsync to glib saving in the
"file already exists" case in order to protect against this on other
systems, but the main future Linux filesystems at least are sane in
their default configurations.
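
For illustration, a rough sketch of what "fsync only in the file
already exists case" could look like, again in Python. This is not the
actual glib change; save_with_fsync and its return value are
hypothetical, and the existence check is a simplification.

```python
import os
import tempfile


def save_with_fsync(path, data):
    """Save `data` to `path` via rename, fsync'ing when overwriting.

    Hypothetical sketch: returns True if an existing file was
    overwritten (and the new data was therefore fsync'ed first).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Only pay for fsync when there is old data that could be lost.
    overwriting = os.path.exists(path)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if overwriting:
                # Push the new contents to disk before the rename, so
                # a crash cannot leave a 0 byte file under the old name.
                tmp.flush()
                os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return overwriting
```

When the file does not exist yet there is no old data to lose, so the
sketch skips the fsync and keeps the cheap path.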



