Re: fsync in glib/gio



On Sat, 14 Mar 2009 20:10:32 -0400 Morten Welinder wrote:

> This is crazy.
> 
> People are actually advocating that thousands upon thousands of
> applications need to be changed.

If they're behaving incorrectly, yes.

But I don't think most of them are.  The only case where *not* doing an
fsync() turns into a problem is if you have a power failure or system
crash at an inopportune time.  So for many applications, that risk
might be acceptable (not all files have equal value).

And the funny thing is: what you think is crazy is EXACTLY what's
being done here.  We're adding a pre-rename fsync() to gio in the read
file -> write tmp file -> rename tmp over original file case.  So
there's one of those alleged thousands upon thousands of apps that
just got fixed.
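
For anyone who wants to see the shape of that pattern, here's a rough
sketch in plain POSIX C.  To be clear, this is NOT the gio patch: the
names replace_file_durably() and write_all() and the fixed ".new"
suffix are made up for illustration, it skips the "read file" step, and
it assumes the tmp file lives in the same directory as the original.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all of buf, retrying on short writes
 * and EINTR.  Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: replace `path` with `len` bytes from `data`.
 * Returns 0 on success, -1 with errno set on failure. */
int replace_file_durably(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The step under discussion: push the new data to disk *before*
     * the rename makes it visible under the final name. */
    if (write_all(fd, data, len) != 0 || fsync(fd) != 0) {
        int saved = errno;
        close(fd);
        unlink(tmp);
        errno = saved;
        return -1;
    }
    if (close(fd) != 0) {
        int saved = errno;
        unlink(tmp);
        errno = saved;
        return -1;
    }

    /* Swap the new file into place; on failure, clean up the tmp file. */
    if (rename(tmp, path) != 0) {
        int saved = errno;
        unlink(tmp);
        errno = saved;
        return -1;
    }
    return 0;
}
```

A real implementation would pick a unique tmp name (mkstemp() or
similar) and preserve the original's permissions, and this sketch does
not fsync the containing directory, so it says nothing about whether
the rename itself has reached the disk.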

> Yes, POSIX allows this particular idiotic behaviour.  So what?  It
> probably also allows free() to do nothing, yet no-one in their right
> mind would want that.

Actually you might in some particularly weird scenarios.  free()
usually causes memory fragmentation, which might not be desirable.  If
you have a fixed amount of RAM and fixed amount of data you're
operating on, a controlled memory 'leak' might be what you want for
best performance.

(Hey, it's a stupid example, but your example was pretty stupid too.  I
figure a stupid example deserves a stupid response.)

>  Or maybe you would be upset if the code
> fragment
> 
>     const char *s = "x";
>     int i = (s+1)-s;
> 
> formatted your hard drive.  Yes, the C standard really does allow that
> to happen.
> (C99 section 6.5.6 #9, if you really want  the details.)

Sorry, dude, but now you're just not making sense.  I just looked up
the C99 spec[1] and read that section.  It doesn't say anything about
formatting your hard drive.  It just talks about how pointer arithmetic must
result in well-defined values inside the object the pointer points to.
Nothing new there.

> The mere fact that a standard allows an idiotic implementation
> doesn't mean we should play ball with it.  The same standard also
> allows sane implementations.

Well, what's idiotic and what's sane is a matter of opinion.  Accepting
a risk of file corruption in certain corner cases (if you don't follow
the spec) in exchange for better performance seems reasonable to me --
for the right kind of app and the right kind of data.

> We could litter fsync() calls all over, but...
> 
> 1. It describes a semantic that isn't really what we want.  In fact,
>     there is no way to get exactly the semantics we want with POSIX.
>     We have to ask for the please-wait-for-the-disk semantics we
>     don't want.  That's a sure way of getting sluggish programs.

True, in some cases.

> 2. Shell scripts, Makefiles, and other languages without explicit
>     fsync control will really kill you.  Instead of...
> 
>         foo <file >file.new
>         mv file.new file
> 
>     ...you get to write...
> 
>         foo <file >file.new
>         sync
>         mv file.new file

That's a deficiency of the shell (it doesn't provide anything analogous
to fsync()), not of the requirement that the sync must be done.
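
What the shell is missing is something like a per-file sync, rather
than the whole-system 'sync'.  A minimal sketch of the core of such a
tool is below; fsync_path() is a made-up name, and it assumes the
platform lets you fsync() a descriptor opened read-only (Linux does; I
wouldn't swear to it everywhere).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: flush one file's data to disk, by path.
 * Returns 0 on success, -1 with errno set on failure. */
int fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    int saved = errno;   /* keep fsync()'s errno across close() */
    close(fd);
    errno = saved;
    return rc;
}
```

Wrapped in a trivial main(), that would slot in between the redirect
and the mv in place of the full 'sync', touching only file.new.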

>     Performance might be affected.

Guess what?  You can't have everything.  Performance and reliability
are often at odds, and you need to find a trade-off you're happy with.
If you have a level of reliability that you can't live with, then you
may have to reduce performance to get what you want.  That's life.
Deal with it.

> 3. Auditing and changing thousands of programs?  Expect bugs.

Arguably, they're already buggy for expecting behavior that isn't
guaranteed.

> We already break the strict letter of POSIX and the C standard in
> fifty different ways.

Wow, this statement is pretty useless and hand-wavy.  Maybe those fifty
different ways are reasonable and ok, and don't risk user data.  Maybe
those fifty different ways don't actually exist and you're just
trash-talking.

> If someone shows up with an environment that doesn't behave as we
> want, we say "sorry, no ball".  Just add stupid file systems to the
> list.

Well apparently 'we' didn't do that: ext4 came out, distros started
using it by default, this issue occurred, and there's pain involved.
Apparently the ext4 devs have caved to pressure and will be adding a
hack to the next version of the driver to order writes to avoid this
specific failure case.  That's all well and good... until the next
filesystem comes along and does something similar.  Or even something
different, but with the same effect.

Again: if you don't like the spec, get it changed or amended!  Then,
later, when this happens again, you can clearly point a finger at the
FS developer and say "yes, this really is your bug."  And who knows,
maybe it won't happen again if there's better behavior defined in the
spec.

	-brian

[1] http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

