UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Clint O
Sep 21 08:57:54 belle fsck: /dev/ad4s1d: 1 DUP I=190
Sep 21 08:57:54 belle fsck: /dev/ad4s1d: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.

OK, so I ran fsck manually (even with -y), yet it refuses to clear or fix
anything in response to the questions it poses as it runs.  What does this
all mean?

Thanks,

-Clint

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
On Sun, Sep 21, 2008 at 02:34:26PM -0700, Clint Olsen wrote:
> Sep 21 08:57:54 belle fsck: /dev/ad4s1d: 1 DUP I=190
> Sep 21 08:57:54 belle fsck: /dev/ad4s1d: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.
>
> Ok, so I ran fsck manually (even with -y), but yet it refuses to clear/fix
> whatever to the questions posed as fsck runs.  What does this all mean?

Are you running fsck on the filesystem while it's mounted?  Are you doing
this in single-user or multi-user mode?

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
On Sun, Sep 21, 2008 at 02:59:30PM -0700, Clint Olsen wrote:

> I ran in multi-user mode because the system booted.  I figured that it
> would have halted the boot if it was serious enough to warrant single-user
> mode fsck.  That has happened before.
>
> Thanks,
>
> -Clint
>
> On Sep 21, Jeremy Chadwick wrote:
> > Are you running fsck on the filesystem while its mounted?  Are you doing
> > this in single-user or multi-user mode?

Re-adding mailing list to the CC list.

No, I don't think that is the case, assuming the filesystems are UFS2
and are using softupdates.  When booting multi-user, fsck is run in the
background, meaning the system is fully up + usable even before the fsck
has started.

Consider using background_fsck="no" in /etc/rc.conf if you prefer the
old behaviour.  Otherwise, boot single-user then do the fsck.
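
As a minimal sketch of the rc.conf change described above (the comment
lines are added for illustration, and the value is written in uppercase
only to match the style used in /etc/defaults/rc.conf):

```sh
# /etc/rc.conf fragment: check dirty filesystems in the foreground at
# boot instead of in the background after the system is already up.
background_fsck="NO"
```

The change takes effect at the next boot; it does not repair a filesystem
that is already flagged as inconsistent.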

You could also consider using clri(8) to clear the inode (190).  Do this
in single-user while the filesystem is not mounted.  After using clri,
run fsck a couple times.
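
The steps above could be sketched roughly as follows.  This is a dry run
under stated assumptions: the device and inode number are taken from the
fsck log lines at the top of the thread, the filesystem is assumed to be
unmounted (single-user mode), and run and repair_plan are hypothetical
helper names that only print the commands rather than executing them.

```sh
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
DEV=/dev/ad4s1d   # device named in the fsck log message
INODE=190         # inode reported as "1 DUP I=190"

# Hypothetical helper: echo the command line; it does not execute anything.
run() { echo "+ $*"; }

repair_plan() {
    run clri "$DEV" "$INODE"   # clear the offending inode
    run fsck -y "$DEV"         # first pass after clearing
    run fsck -y "$DEV"         # second pass to confirm nothing else turns up
}

repair_plan
```

Clearing an inode discards whatever file it described, so this is a last
resort once a plain single-user fsck has failed to fix the problem.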

Also, are there any kernel messages about ATA/SCSI disk errors or other
anomalies?

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Clint O
On Sep 21, Jeremy Chadwick wrote:
> Re-adding mailing list to the CC list.
>
> No, I don't think that is the case, assuming the filesystems are UFS2
> and are using softupdates.  When booting multi-user, fsck is run in the
> background, meaning the system is fully up + usable even before the fsck
> has started.

The last time things crashed hard, the boot sequence was halted in order to
run fsck.
 
> Consider using background_fsck="no" in /etc/rc.conf if you prefer the
> old behaviour.  Otherwise, boot single-user then do the fsck.

Yes, I'll do this.
 
> You could also consider using clri(8) to clear the inode (190).  Do this
> in single-user while the filesystem is not mounted.  After using clri,
> run fsck a couple times.

Ok, thanks.
 
> Also, are there any kernel messages about ATA/SCSI disk errors or other
> anomalies?

None.  In fact smartctl will not do anything now.  It just prints out the
quick banner message and exits immediately with no error.

-Clint

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Clint O
In reply to this post by Jeremy Chadwick-3
On Sep 21, Jeremy Chadwick wrote:
> You could also consider using clri(8) to clear the inode (190).  Do this
> in single-user while the filesystem is not mounted.  After using clri,
> run fsck a couple times.

Booting single-user and running fsck again seems to have corrected these
errors.  For some reason it said another disk was not properly dismounted
(/dev/ad0s1d - /home) and so it's running fsck in the background since I
booted.

-Clint

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
In reply to this post by Clint O
On Sun, Sep 21, 2008 at 04:59:50PM -0700, Clint Olsen wrote:
> > Also, are there any kernel messages about ATA/SCSI disk errors or other
> > anomalies?
>
> None.  In fact smartctl will not do anything now.  It just prints out the
> quick banner message and exits immediately with no error.

With regards to this specific item: can you provide the full smartctl
command you're using (including device), and all of the output?  I have
an idea of what the problem is, but I'd need to see the output first.

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Clint O
On Sep 21, Jeremy Chadwick wrote:
> With regards to this specific item: can you provide the full smartctl
> command you're using (including device), and all of the output?  I have
> an idea of what the problem is, but I'd need to see the output first.

# smartctl /dev/ad6
smartctl version 5.38 [i386-portbld-freebsd6.3] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

-Clint

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
On Sun, Sep 21, 2008 at 05:40:40PM -0700, Clint Olsen wrote:
> On Sep 21, Jeremy Chadwick wrote:
> > With regards to this specific item: can you provide the full smartctl
> > command you're using (including device), and all of the output?  I have
> > an idea of what the problem is, but I'd need to see the output first.
>
> # smartctl /dev/ad6
> smartctl version 5.38 [i386-portbld-freebsd6.3] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/

The tool is behaving how it should.  Try using the -a flag.
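
A small sketch of the suggested invocation, assuming the same device as
in the quoted command.  smart_report is a hypothetical wrapper name; it
always prints the command line and only runs smartctl if the tool and the
device are actually present.

```sh
DEV=/dev/ad6   # device from the quoted command above

smart_report() {
    echo "+ smartctl -a $DEV"
    # -a asks smartctl to print all SMART information for the device.
    if command -v smartctl >/dev/null 2>&1 && [ -e "$DEV" ]; then
        smartctl -a "$DEV"
    fi
}

smart_report
```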

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Clint O
On Sep 21, Jeremy Chadwick wrote:
> The tool is behaving how it should.  Try using the -a flag.

Ok, I feel dumb now :)

Thanks,

-Clint

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Derek Kulinski-2
In reply to this post by Jeremy Chadwick-3
Hello Jeremy,

Sunday, September 21, 2008, 3:07:20 PM, you wrote:

> Consider using background_fsck="no" in /etc/rc.conf if you prefer the
> old behaviour.  Otherwise, boot single-user then do the fsck.

Actually, what's the advantage of having fsck run in the background if it
isn't capable of fixing things?
Isn't it more dangerous to have it that way?  I.e., the administrator might
not notice the problem; also, the filesystem could break even further...

--
Best regards,
 Derek                            mailto:[hidden email]

I tried to daydream, but my mind kept wandering.


Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
On Fri, Sep 26, 2008 at 09:33:41PM -0700, Derek Kuliński wrote:

> Hello Jeremy,
>
> Sunday, September 21, 2008, 3:07:20 PM, you wrote:
>
> > Consider using background_fsck="no" in /etc/rc.conf if you prefer the
> > old behaviour.  Otherwise, boot single-user then do the fsck.
>
> Actually what's the advantage of having fsck run in background if it
> isn't capable of fixing things?
> Isn't it more dangerous to be it like that? i.e. administrator might
> not notice the problem; also filesystem could break even further...

This question should really be directed at a set of different folks,
e.g. the actual developers of said stuff (UFS2 and soft updates
specifically), because it's opening up a can of worms.

I believe it has to do with the fact that there is much faith given to
UFS2 soft updates -- the ability to background fsck allows the user to
boot their system and have it up and working (able to log in, etc.) in a
much shorter amount of time[1].  It makes the assumption that "everything
will work just fine", which is faulty.

It also gives the impression of a journalled filesystem, which UFS2 with
soft updates is not.  gjournal(8), on the other hand, is, and doesn't
require fsck at all[2].

I also think this further adds fuel to the "so why are we enabling soft
updates by default and using UFS2 as a filesystem again?" fire.  I'm
sure someone will respond to this with "So use ZFS and shut up".  *sigh*

[1]: http://lists.freebsd.org/pipermail/freebsd-questions/2004-December/069114.html
[2]: http://lists.freebsd.org/pipermail/freebsd-questions/2008-April/173501.html

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Derek Kulinski-2
Hello Jeremy,

Friday, September 26, 2008, 10:14:13 PM, you wrote:

>> Actually what's the advantage of having fsck run in background if it
>> isn't capable of fixing things?
>> Isn't it more dangerous to be it like that? i.e. administrator might
>> not notice the problem; also filesystem could break even further...

> This question should really be directed at a set of different folks,
> e.g. actual developers of said stuff (UFS2 and soft updates in
> specific), because it's opening up a can of worms.

> I believe it has to do with the fact that there is much faith given to
> UFS2 soft updates -- the ability to background fsck allows the user to
> boot their system and have it up and working (able to log in, etc.) in a
> much shorter amount of time[1].  It makes the assumption that "everything
> will work just fine", which is faulty.

As far as I know (at least ideally, when write caching is disabled)
the data should always be consistent, and all fsck is supposed to be
doing is freeing unreferenced blocks that were allocated.
Wouldn't it be possible for background fsck to do that while the
filesystem is mounted and, if some unrepairable error somehow happens
(while in theory it should be impossible), just periodically scream at
the emergency log level?

> It also gives the impression of a journalled filesystem, which UFS2 soft
> updates are not.  gjournal(8) on the other hand, is, and doesn't require
> fsck at all[2].

> I also think this further adds fuel to the "so why are we enabling soft
> updates by default and using UFS2 as a filesystem again?" fire.  I'm
> sure someone will respond to this with "So use ZFS and shut up".  *sigh*

I think the reason for using Soft Updates by default is that it was
a pretty hard thing to implement, and (at least in theory) it is supposed
to be as reliable as journaling.

Also, if I remember correctly, PJD said that gjournal is performing
much better with small files, while softupdates is faster with big
ones.

--
Best regards,
 Derek                            mailto:[hidden email]

Programmers are tools for converting caffeine into code.


Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
On Fri, Sep 26, 2008 at 10:35:57PM -0700, Derek Kuliński wrote:

> Hello Jeremy,
>
> Friday, September 26, 2008, 10:14:13 PM, you wrote:
>
> >> Actually what's the advantage of having fsck run in background if it
> >> isn't capable of fixing things?
> >> Isn't it more dangerous to be it like that? i.e. administrator might
> >> not notice the problem; also filesystem could break even further...
>
> > This question should really be directed at a set of different folks,
> > e.g. actual developers of said stuff (UFS2 and soft updates in
> > specific), because it's opening up a can of worms.
>
> > I believe it has to do with the fact that there is much faith given to
> > UFS2 soft updates -- the ability to background fsck allows the user to
> > boot their system and have it up and working (able to log in, etc.) in a
> > much shorter amount of time[1].  It makes the assumption that "everything
> > will work just fine", which is faulty.
>
> As far as I know (at least ideally, when write caching is disabled)

Re: write caching: wheelies and burn-outs in empty parking lots
detected.

Let's be realistic.  We're talking about ATA and SATA hard disks, hooked
up to on-board controllers -- these are the majority of users.  Those
with ATA/SATA RAID controllers (not on-board RAID either; most/all of
those do not let you disable drive write caching) *might* have a RAID
BIOS menu item for disabling said feature.

FreeBSD atacontrol does not let you toggle such features (although "cap"
will show you if the feature is available and whether it's enabled).

Users using SCSI will most definitely have the ability to disable
said feature (either via SCSI BIOS or via camcontrol).  But the majority
of users are not using SCSI disks, because the majority of users are not
going to spend hundreds of dollars on a controller followed by hundreds
of dollars for a small (~74GB) disk.

Regardless of all of this, end-users should, in no way shape or form,
be expected to go to great lengths to disable their disk's write cache.
They will not, I can assure you.  Thus, we must assume: write caching
on a disk will be enabled, period.  If a filesystem is engineered with
that fact ignored, then the filesystem is either 1) worthless, or 2)
serves a very niche purpose and should not be the default filesystem.

Do we agree?

> the data should always be consistent, and all fsck supposed to be
> doing is to free unreferenced blocks that were allocated.

fsck does a heck of a lot more than that, and there's no guarantee
that's all fsck is going to do on a UFS2+SU filesystem.  I'm under the
impression it does a lot more than just looking for unref'd blocks.

> Wouldn't be possible for background fsck to do that while the
> filesystem is mounted, and if there's some unrepairable error, that
> somehow happen (while in theory it should be impossible) just
> periodically scream on the emergency log level?

The system is already up and the filesystems mounted.  If the error in
question is of such severity that it would impact a user's ability to
reliably use the filesystem, how do you expect constant screaming on
the console will help?  A user won't know what it means; there is
already evidence of this happening (re: mysterious ATA DMA errors which
still cannot be figured out[6]).

IMHO, a dirty filesystem should not be mounted until it's been fully
analysed/scanned by fsck.  So again, people are putting faith into
UFS2+SU despite actual evidence proving that it doesn't handle all
scenarios.

> > It also gives the impression of a journalled filesystem, which UFS2 soft
> > updates are not.  gjournal(8) on the other hand, is, and doesn't require
> > fsck at all[2].
>
> > I also think this further adds fuel to the "so why are we enabling soft
> > updates by default and using UFS2 as a filesystem again?" fire.  I'm
> > sure someone will respond to this with "So use ZFS and shut up".  *sigh*
>
> I think the reason for using Soft Updates by default is that it was
> a pretty hard thing to implement, and (at least in theory it supposed
> by as reliable as journaling.

The problem here is that when it was created, it was sort of an
"experiment".  Now, when someone installs FreeBSD, UFS2 is the default
filesystem used, and SU are enabled on every filesystem except the root
fs.  Thus, we have now put ourselves into a situation where said
feature ***must*** be reliable in all cases.

You're also forgetting a huge focus of SU -- snapshots[1].  However, there
are more than enough facts on the table at this point concluding that
snapshots are causing more problems[7] than previously expected.  And
there's further evidence filesystem snapshots shouldn't even be used in
this way[8].

> Also, if I remember correctly, PJD said that gjournal is performing
> much better with small files, while softupdates is faster with big
> ones.

Okay, so now we want to talk about benchmarks.  The benchmarks you're
talking about are in two places[2][3].

The benchmarks pjd@ provided were very basic/simple, which I feel is
good, because the tests were realistic (common tasks people will do).
The benchmarks mckusick@ provided for UFS2+SU were based on SCSI
disks, which is... interesting to say the least.

Bruce Evans responded with some more data[4].

I particularly enjoy this quote in his benchmark: "I never found the
exact cause of the slower readback ...", followed by (plausible)
speculations as to why that is.

I'm sorry that I sound like such a hard-ass on this matter, but there is
a glaring fact that people seem to be overlooking intentionally:

Filesystems have to be reliable; data integrity is focus #1, and cannot
be sacrificed.  Users and administrators *expect* a filesystem to be
reliable.  No one is going to keep using a filesystem if it has
disadvantages which can result in data loss or "waste of administrative
time" (which I believe is what's occurring here).

Users *will* switch to another operating system that has filesystems
which were not engineered/invented with these features in mind.  Or,
they can switch to another filesystem assuming the OS offers one which
performs equally well and is guaranteed to be reliable --
and that's assuming the user wants to spend the time to reformat and
reinstall just to get that.

In the case of "bit rot" (e.g. drive cache going bad silently, bad
cables, or other forms of low-level data corruption), a filesystem is
likely not to be able to cope with this (but see below).

A common rebuttal here would be: "so use UFS2 without soft updates".
Excellent advice!  I might consider it myself!  But the problem is that
we cannot expect users to do that.  Why?  Because the defaults chosen
during sysinstall are to use SU for all filesystems except root.  If SU
is not reliable (or is "reliable in most cases" -- same thing if you ask
me), then it should not be enabled by default.  I think we (FreeBSD)
might have been a bit hasty in deciding to choose that as a default.

Next: a system locking up (or a kernel panic) should result in a dirty
filesystem.  That filesystem should be *fully recoverable* from that
kind of error, with no risk of data loss (but see below).

(There is the obvious case where a file is written to the disk, and the
disk has not completed writing the data from its internal cache to the
disk itself (re: write caching); if power is lost, the disk may not have
finished writing the cache to disk.  In this case, the file is going to
be sparse -- there is absolutely nothing that can be done about this
with any filesystem, including ZFS (to my knowledge).  This situation
is acceptable; nature of the beast.)

The filesystem should be fully analysed and any errors repaired (either
with user interaction or automatically -- I'm sure it depends on the
kind of error) **before** the filesystem is mounted.

This is where SU gets in the way.  The filesystem is mounted and the
system is brought up + online 60 seconds before the fsck starts.  The
assumption made is that the errors in question will be fully recoverable
by an automatic fsck, which as this thread proves, is not always the
case.
 
ZFS is the first filesystem, to my knowledge, which provides 1) a
reliable filesystem, 2) detection of filesystem problems in real-time or
during scrubbing, 3) repair of problems in real-time (assuming raidz1 or
raidz2 are used), and 4) does not need fsck.  This makes ZFS powerful.

"So use ZFS!"  A good piece of advice -- however, I've already had
reports from users that they will not consider ZFS for FreeBSD at this
time.  Why?  Because ZFS on FreeBSD can panic the system easily due to
kmem exhaustion.  Proper tuning can alleviate this problem, but users do
not want to have to "tune" their system to get stability (and I feel
this is a very legitimate argument).

Additionally, FreeBSD doesn't offer ZFS as a filesystem during
installation.  PC-BSD does, AFAIK.  So on FreeBSD, you have to go
through a bunch of rigmarole[5] to get it to work (and doing this
after-the-fact is a real pain in the rear -- believe me, I did it this
weekend.)

So until both of these ZFS-oriented issues can be dealt with, some
users aren't considering it.

This is the reality of the situation.  I don't think what users and
administrators want is unreasonable; they may be rough demands, but
that's how things are in this day and age.

Have I provided enough evidence?  :-)

[1]: http://www.usenix.org/publications/library/proceedings/bsdcon02/mckusick/mckusick_html/index.html
[2]: http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064043.html
[3]: http://www.usenix.org/publications/library/proceedings/usenix2000/general/full_papers/seltzer/seltzer_html/index.html
[4]: http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064166.html
[5]: http://wiki.freebsd.org/JeremyChadwick/FreeBSD_7.x_on_a_ZFS_pool
[6]: http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting
[7]: http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues
[8]: http://lists.freebsd.org/pipermail/freebsd-stable/2007-January/032070.html

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Peter Jeremy
On 2008-Sep-26 23:44:17 -0700, Jeremy Chadwick <[hidden email]> wrote:
>On Fri, Sep 26, 2008 at 10:35:57PM -0700, Derek Kuliński wrote:
>> As far as I know (at least ideally, when write caching is disabled)
...
>FreeBSD atacontrol does not let you toggle such features (although "cap"
>will show you if feature is available and if it's enabled or not).

True but it can be disabled via the loader tunable hw.ata.wc (at
least in theory - apparently some drives don't obey the cache disable
command to make them look better in benchmarks).
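
A rough sketch of setting that tunable.  LOADER_CONF is an assumption
made here so the sketch is safe to try: it defaults to a scratch file,
whereas on a real system the line would go into /boot/loader.conf and
take effect at the next boot.

```sh
# Assumption: LOADER_CONF points at a scratch copy unless set by the caller.
LOADER_CONF="${LOADER_CONF:-$(mktemp)}"

# Append the tunable only if the file does not already set it.
if ! grep -q '^hw\.ata\.wc=' "$LOADER_CONF"; then
    # 0 asks ATA drives to disable write caching.
    echo 'hw.ata.wc="0"' >> "$LOADER_CONF"
fi

cat "$LOADER_CONF"
```

As noted above, this only requests that the drive turn its cache off;
whether a given drive honours the request is a separate question.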

>Users using SCSI will most definitely have the ability to disable
>said feature (either via SCSI BIOS or via camcontrol).

Soft-updates plus write caching isn't an issue with tagged queueing
(which is standard for SCSI) because the critical point for
soft-updates is knowing when the data is written to non-volatile
storage - which tagged queuing provides.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Derek Kulinski-2
In reply to this post by Jeremy Chadwick-3
Hello Jeremy,

Friday, September 26, 2008, 11:44:17 PM, you wrote:

>> As far as I know (at least ideally, when write caching is disabled)

> Re: write caching: wheelies and burn-outs in empty parking lots
> detected.

> Let's be realistic.  We're talking about ATA and SATA hard disks, hooked
> up to on-board controllers -- these are the majority of users.  Those
> with ATA/SATA RAID controllers (not on-board RAID either; most/all of
> those do not let you disable drive write caching) *might* have a RAID
> BIOS menu item for disabling said feature.

> FreeBSD atacontrol does not let you toggle such features (although "cap"
> will show you if feature is available and if it's enabled or not).

> Users using SCSI will most definitely have the ability to disable
> said feature (either via SCSI BIOS or via camcontrol).  But the majority
> of users are not using SCSI disks, because the majority of users are not
> going to spend hundreds of dollars on a controller followed by hundreds
> of dollars for a small (~74GB) disk.

> Regardless of all of this, end-users should, in no way shape or form,
> be expected to go to great lengths to disable their disk's write cache.
> They will not, I can assure you.  Thus, we must assume: write caching
> on a disk will be enabled, period.  If a filesystem is engineered with
> that fact ignored, then the filesystem is either 1) worthless, or 2)
> serves a very niche purpose and should not be the default filesystem.

> Do we agree?

Yes, but...

In the link you sent to me, someone mentioned that write caching
always creates problems, regardless of OS or filesystem.

There's more below.

>> the data should always be consistent, and all fsck supposed to be
>> doing is to free unreferenced blocks that were allocated.
> fsck does a heck of a lot more than that, and there's no guarantee
> that's all fsck is going to do on a UFS2+SU filesystem.  I'm under the
> impression it does a lot more than just looking for unref'd blocks.

Yes, fsck does a lot more than that.  But the whole point of soft
updates is to reduce fsck's work to deallocating blocks that were
allocated but never referenced.

Anyway, maybe my information is invalid, though the funny thing is that
Soft Updates was mentioned in one of my lectures on Operating Systems.

Apparently the goal of Soft Updates is to always enforce these rules
in a very efficient manner, by reordering the writes:
1. Never point to a data structure before initializing it
2. Never reuse a structure before nullifying pointers to it
3. Never reset last pointer to live structure before setting a new one
4. Always mark free-block bitmap entries as used before making the
   directory entry point to them

The problem comes with disks which, for performance reasons, cache the
data and then write it back to the disk in a different order.
I think that's the reason why it's recommended to disable write caching.
If a disk is reordering the writes, it renders soft updates
useless.

But if the write order is preserved, all data always remains
consistent; the only thing that might appear is blocks that were
marked as being used, but with nothing pointing to them yet.

So (in an ideal situation, when nothing interferes) all fsck needs to do
is just to scan the filesystem and deallocate those blocks.
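
To make that concrete, here is a toy model of the cleanup in the ideal
case.  It is not how fsck itself works internally, and the block numbers
are made up for illustration: one list stands in for the free-block
bitmap, the other for the blocks that something actually points to.

```sh
# Toy model only; block numbers are invented for the example.
allocated="3 7 9 12"   # blocks marked "in use" in the free-block bitmap
referenced="3 9"       # blocks some inode or directory entry points to

leaked=""
for b in $allocated; do
    case " $referenced " in
        *" $b "*) ;;                            # referenced: leave it alone
        *) leaked="${leaked:+$leaked }$b" ;;    # marked used, nothing points to it
    esac
done

echo "blocks to free: $leaked"   # prints "blocks to free: 7 12"
```

Because the ordering rules only ever allow the bitmap to run ahead of the
pointers, the leftover set can contain leaked blocks but never a block
that is referenced and yet marked free.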

> The system is already up and the filesystems mounted.  If the error in
> question is of such severity that it would impact a user's ability to
> reliably use the filesystem, how do you expect constant screaming on
> the console will help?  A user won't know what it means; there is
> already evidence of this happening (re: mysterious ATA DMA errors which
> still cannot be figured out[6]).

> IMHO, a dirty filesystem should not be mounted until it's been fully
> analysed/scanned by fsck.  So again, people are putting faith into
> UFS2+SU despite actual evidence proving that it doesn't handle all
> scenarios.

Yes, I think the background fsck should be disabled by default, with a
possibility to enable it if the user is sure that nothing will
interfere with soft updates.

> The problem here is that when it was created, it was sort of an
> "experiment".  Now, when someone installs FreeBSD, UFS2 is the default
> filesystem used, and SU are enabled on every filesystem except the root
> fs.  Thus, we have now put ourselves into a situation where said
> feature ***must*** be reliable in all cases.

I think in the worst case it is just as reliable as if it weren't
enabled (the only danger is the background fsck).

> You're also forgetting a huge focus of SU -- snapshots[1].  However, there
> are more than enough facts on the table at this point concluding that
> snapshots are causing more problems[7] than previously expected.  And
> there's further evidence filesystem snapshots shouldn't even be used in
> this way[8].

There's not much to argue about there.

>> Also, if I remember correctly, PJD said that gjournal is performing
>> much better with small files, while softupdates is faster with big
>> ones.

> Okay, so now we want to talk about benchmarks.  The benchmarks you're
> talking about are in two places[2][3].

> The benchmarks pjd@ provided were very basic/simple, which I feel is
> good, because the tests were realistic (common tasks people will do).
> The benchmarks mckusick@ provided for UFS2+SU were based on SCSI
> disks, which is... interesting to say the least.

> Bruce Evans responded with some more data[4].

> I particularly enjoy this quote in his benchmark: "I never found the
> exact cause of the slower readback ...", followed by (plausible)
> speculations as to why that is.

> I'm sorry that I sound like such a hard-ass on this matter, but there is
> a glaring fact that people seem to be overlooking intentionally:

> Filesystems have to be reliable; data integrity is focus #1, and cannot
> be sacrificed.  Users and administrators *expect* a filesystem to be
> reliable.  No one is going to keep using a filesystem if it has
> disadvantages which can result in data loss or "waste of administrative
> time" (which I believe is what's occurring here).

> Users *will* switch to another operating system that has filesystems
> which were not engineered/invented with these features in mind.  Or,
> they can switch to another filesystem assuming the OS offers one which
> performs equally as good/well and is guaranteed to be reliable --
> and that's assuming the user wants to spend the time to reformat and
> reinstall just to get that.

I wasn't trying to argue about that. Perhaps my assumption is wrong,
but I believe that the problems we know about with Soft Updates, in the
worst case, leave the system as reliable as it would be without them.

> In the case of "bit rot" (e.g. drive cache going bad silently, bad
> cables, or other forms of low-level data corruption), a filesystem is
> likely not to be able to cope with this (but see below).

> A common rebuttal here would be: "so use UFS2 without soft updates".
> Excellent advice!  I might consider it myself!  But the problem is that
> we cannot expect users to do that.  Why?  Because the defaults chosen
> during sysinstall are to use SU for all filesystems except root.  If SU
> is not reliable (or is "reliable in most cases" -- same thing if you ask
> me), then it should not be enabled by default.  I think we (FreeBSD)
> might have been a bit hasty in deciding to choose that as a default.

> Next: a system locking up (or a kernel panic) should result in a dirty
> filesystem.  That filesystem should be *fully recoverable* from that
> kind of error, with no risk of data loss (but see below).

> (There is the obvious case where a file is written to the disk, and the
> disk has not completed writing the data from its internal cache to the
> disk itself (re: write caching); if power is lost, the disk may not have
> finished writing the cache to disk.  In this case, the file is going to
> be sparse -- there is absolutely nothing that can be done about this
> with any filesystem, including ZFS (to my knowledge).  This situation
> is acceptable; nature of the beast.)

> The filesystem should be fully analysed and any errors repaired (either
> with user interaction or automatically -- I'm sure it depends on the
> kind of error) **before** the filesystem is mounted.

> This is where SU gets in the way.  The filesystem is mounted and the
> system is brought up + online 60 seconds before the fsck starts.  The
> assumption made is that the errors in question will be fully recoverable
> by an automatic fsck, which as this thread proves, is not always the
> case.

That's why I think background fsck should be disabled by default.
Though I still don't think that soft updates hurt anything (except
perhaps performance).

> ZFS is the first filesystem, to my knowledge, which provides 1) a
> reliable filesystem, 2) detection of filesystem problems in real-time or
> during scrubbing, 3) repair of problems in real-time (assuming raidz1 or
> raidz2 are used), and 4) does not need fsck.  This makes ZFS powerful.

> "So use ZFS!"  A good piece of advice -- however, I've already had
> reports from users that they will not consider ZFS for FreeBSD at this
> time.  Why?  Because ZFS on FreeBSD can panic the system easily due to
> kmem exhaustion.  Proper tuning can alleviate this problem, but users do
> not want to have to "tune" their system to get stability (and I feel
> this is a very legitimate argument).

> Additionally, FreeBSD doesn't offer ZFS as a filesystem during
> installation.  PC-BSD does, AFAIK.  So on FreeBSD, you have to go
> through a bunch of rigmarole[5] to get it to work (and doing this
> after-the-fact is a real pain in the rear -- believe me, I did it this
> weekend.)

> So until both of these ZFS-oriented issues can be dealt with, some
> users aren't considering it.

> This is the reality of the situation.  I don't think what users and
> administrators want is unreasonable; they may be rough demands, but
> that's how things are in this day and age.

> Have I provided enough evidence?  :-)

Yes, but as far as I understand it's not as bad as you think :)
I could be wrong though.

I 100% agree on disabling background fsck, but I don't think soft
updates are making the system any less reliable than it would be
without it.

Also, I'll have to play with ZFS some day :)

--
Best regards,
 Derek                            mailto:[hidden email]

It's a little-known fact that the Y1K problem caused the Dark Ages.



Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Erik Trulsson
In reply to this post by Jeremy Chadwick-3
On Fri, Sep 26, 2008 at 11:44:17PM -0700, Jeremy Chadwick wrote:

> On Fri, Sep 26, 2008 at 10:35:57PM -0700, Derek Kuliński wrote:
> > Hello Jeremy,
> >
> > Friday, September 26, 2008, 10:14:13 PM, you wrote:
> >
> > >> Actually what's the advantage of having fsck run in background if it
> > >> isn't capable of fixing things?
> > >> Isn't it more dangerous to be it like that? i.e. administrator might
> > >> not notice the problem; also filesystem could break even further...
> >
> > > This question should really be directed at a set of different folks,
> > > e.g. actual developers of said stuff (UFS2 and soft updates in
> > > specific), because it's opening up a can of worms.
> >
> > > I believe it has to do with the fact that there is much faith given to
> > > UFS2 soft updates -- the ability to background fsck allows the user to
> > > boot their system and have it up and working (able to log in, etc.) in a
> > > much shorter amount of time[1].  It makes the assumption that "everything
> > > will work just fine", which is faulty.
> >
> > As far as I know (at least ideally, when write caching is disabled)
>
> Re: write caching: wheelies and burn-outs in empty parking lots
> detected.
>
> Let's be realistic.  We're talking about ATA and SATA hard disks, hooked
> up to on-board controllers -- these are the majority of users.  Those
> with ATA/SATA RAID controllers (not on-board RAID either; most/all of
> those do not let you disable drive write caching) *might* have a RAID
> BIOS menu item for disabling said feature.
>
> FreeBSD atacontrol does not let you toggle such features (although "cap"
> will show you if feature is available and if it's enabled or not).

No, but using 'sysctl hw.ata.wc=0' will quickly and easily let you disable
write caching on all ATA/SATA devices.
This was actually the default setting briefly (back in 4.3 IIRC) but was
reverted due to the performance penalty being considered too severe.
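
A hedged sketch of making that setting stick: as far as I know, ata(4)
treats hw.ata.wc as a boot-time loader tunable on at least some releases,
so this assumes it belongs in /boot/loader.conf rather than being changed
on a running system. The helper name disable_ata_wc is made up for
illustration; it appends the line only if the file does not already set
the tunable.

```sh
#!/bin/sh
# Hypothetical helper: record hw.ata.wc="0" in a loader.conf-style file.
# Assumption: the tunable is read at boot, so a reboot is needed afterwards.
disable_ata_wc() {
    conf=$1
    # Add the line only if hw.ata.wc is not already set in the file.
    grep -q '^hw\.ata\.wc=' "$conf" 2>/dev/null || echo 'hw.ata.wc="0"' >> "$conf"
}
```

Run as root it would be called as `disable_ata_wc /boot/loader.conf`.
After the reboot, `sysctl hw.ata.wc` should report the value in effect,
and `atacontrol cap ad4` (ad4 being the disk from the original report)
shows whether the drive itself has write caching enabled.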


>
> Users using SCSI will most definitely have the ability to disable
> said feature (either via SCSI BIOS or via camcontrol).  But the majority
> of users are not using SCSI disks, because the majority of users are not
> going to spend hundreds of dollars on a controller followed by hundreds
> of dollars for a small (~74GB) disk.
>
> Regardless of all of this, end-users should, in no way shape or form,
> be expected to go to great lengths to disable their disk's write cache.
> They will not, I can assure you.  Thus, we must assume: write caching
> on a disk will be enabled, period.  If a filesystem is engineered with
> that fact ignored, then the filesystem is either 1) worthless, or 2)
> serves a very niche purpose and should not be the default filesystem.
>
> Do we agree?

Sort of, but soft updates does not technically need write caching to be
disabled. It does assume that disks will not 'lie' about whether data has
actually been written to the disk or just to the disk's cache.  Many (most?)
ATA/SATA disks are unreliable in this regard, which means that the
guarantees Soft Updates normally gives about the consistency of the file
system no longer hold.



Using UFS2+soft updates on standard ATA/SATA disks (with write caching
enabled) connected to a standard disk controller is not a problem (not any
more than any other file system anyway.)

Using background fsck together with the above setup is not recommended
however.  Background fsck will only handle a subset of the errors that a
standard foreground fsck can handle.  In particular it assumes that the soft
updates guarantees of consistency are in place which would mean that there
are only a few non-critical problems that could happen.  With the above
setup those guarantees are not in place, which means that background fsck
can encounter errors it cannot (and will not) fix.
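
For anyone who wants to follow that advice, a minimal sketch of the
setting, assuming the stock rc.conf(5) knob background_fsck in
/etc/rc.conf. With it set to NO, a dirty filesystem should be checked in
the foreground during boot instead of after the system is already up.

```sh
# /etc/rc.conf fragment: check dirty filesystems in the foreground at boot
# instead of starting a background fsck after the system is up.
background_fsck="NO"
```

If the boot-time check still stops with RUN fsck MANUALLY, a full fsck
(for example `fsck -y /dev/ad4s1d`, using the device from the original
report) would be run from single-user mode with the filesystem unmounted.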






--
<Insert your favourite quote here.>
Erik Trulsson
[hidden email]

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

sthaug
In reply to this post by Derek Kulinski-2
> > IMHO, a dirty filesystem should not be mounted until it's been fully
> > analysed/scanned by fsck.  So again, people are putting faith into
> > UFS2+SU despite actual evidence proving that it doesn't handle all
> > scenarios.
>
> Yes, I think the background fsck should be disabled by default, with a
> possibility to enable it if the user is sure that nothing will
> interfere with soft updates.

Having been bitten by problems in this area more than once, I now always
disable background fsck. Having it disabled by default has my vote too.

Steinar Haug, Nethelp consulting, [hidden email]

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Jeremy Chadwick-3
In reply to this post by Derek Kulinski-2
On Sat, Sep 27, 2008 at 12:37:50AM -0700, Derek Kuliński wrote:

> [...]
>
> I 100% agree on disabling background fsck, but I don't think soft
> updates are making the system any less reliable than it would be
> without it.

With regards to all you've said:

Thank you for these insights.  Everything you and Erik have said has
been quite educational, and I greatly appreciate it.  Always good to
learn from people who know more!  :-)

I believe we're in overall agreement with regards to background_fsck
(should be disabled by default).  I'd file a PR for this sort of thing,
but it almost seems like something that should go to the (private)
developers list for discussion first.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |


Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Oliver Fromme
In reply to this post by sthaug
[hidden email] wrote:
 > [...]
 > > > IMHO, a dirty filesystem should not be mounted until it's been fully
 > > > analysed/scanned by fsck.  So again, people are putting faith into
 > > > UFS2+SU despite actual evidence proving that it doesn't handle all
 > > > scenarios.
 > >
 > > Yes, I think the background fsck should be disabled by default, with a
 > > possibility to enable it if the user is sure that nothing will
 > > interfere with soft updates.
 >
 > Having been bitten by problems in this area more than once, I now always
 > disable background fsck. Having it disabled by default has my vote too.

Just a "me too" here.

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when was the
last time you needed one?"
        -- Tom Cargil, C++ Journal

Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY

Michel Talon
In reply to this post by Clint O
Jeremy Chadwick wrote:

> I believe we're in overall agreement with regards to background_fsck
> (should be disabled by default).

In fact background fsck was introduced for a good reason: waiting for
a full fsck on modern big disks takes far too long. Similarly, write
cache is enabled on ATA disks because without it performance sucks too
much. My humble opinion is that you attach far too much importance to
reliability in this game. There are many reasons why corruption may
happen in files, most of them hardware related (bad RAM, overheating
chipset, etc.). Hence you can never be assured that your data is
perfectly reliable (except perhaps with ZFS's permanent checksumming);
all you have is some probability of reliability. I think that for most
people what is important is a good balance between the risk of
catastrophic failure (which is always there, and is increased only a
little by background fsck) and performance and ease of use. The FreeBSD
developers have chosen this middle ground, with good reason, in my
opinion. People who are more concerned with the reliability of their
data, and are willing to pay the price, can always disable background
fsck, maintain backups, etc. Personally I would run away from a system
requiring hours of fsck before being able to run multiuser. Neither
Windows, with NTFS, nor Linux, with ext3, reiserfs, xfs, jfs, etc.,
requires any form of scandisk or fsck. Demanding that a full fsck be
the default in FreeBSD is akin to alienating a large fraction of users
who have greener pastures easily available. The same goes for asking
to disable write caching on the disks. So for most people there is
some probability of one day getting the UNEXPECTED SOFT UPDATE
INCONSISTENCY message. They will run a full fsck on that occasion, not
a terrible thing. In many years of FreeBSD use, it has happened to me a
small number of times, and I have yet to lose a file, at least that I
noticed.

--

Michel TALON
