Quantcast

TRIM support for UFS?

classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

TRIM support for UFS?

Oliver Fromme
Hi,

I've bought an OCZ Vertex2 E (120 GB SSD) and installed
FreeBSD i386 stable/8 on it, using UFS (UFS2, to be exact).
I've made sure that the partitions are aligned properly,
and used newfs with 4k fragsize and 32k blocksize.
It works very well so far.

Now, the only thing missing would be TRIM support for UFS.
I did some research and found the following patch:

http://people.freebsd.org/~pjd/patches/ufsdel.patch

Apparently that patch is supposed to make UFS use TRIM,
among other things.  However, the patch seems to be rather
old (time stamp says 2007-02-03), and nothing has been
comitted to current or stable so far (I've grepped through
the past 12 months of SVN logs, after TRIM support was
added to ada(4) by mav@).

So, my question is, are there plans to add TRIM support
to UFS?  Is anyone working on it?  Or is it already there
and I just overlooked it?

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"A language that doesn't have everything is actually easier
to program in than some that do."
        -- Dennis M. Ritchie
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Pawel Jakub Dawidek
On Tue, Dec 07, 2010 at 04:31:14PM +0100, Oliver Fromme wrote:

> Hi,
>
> I've bought an OCZ Vertex2 E (120 GB SSD) and installed
> FreeBSD i386 stable/8 on it, using UFS (UFS2, to be exact).
> I've made sure that the partitions are aligned properly,
> and used newfs with 4k fragsize and 32k blocksize.
> It works very well so far.
>
> Now, the only thing missing would be TRIM support for UFS.
> I did some research and found the following patch:
>
> http://people.freebsd.org/~pjd/patches/ufsdel.patch
>
> Apparently that patch is supposed to make UFS use TRIM,
> among other things.  However, the patch seems to be rather
> old (time stamp says 2007-02-03), and nothing has been
> comitted to current or stable so far (I've grepped through
> the past 12 months of SVN logs, after TRIM support was
> added to ada(4) by mav@).
>
> So, my question is, are there plans to add TRIM support
> to UFS?  Is anyone working on it?  Or is it already there
> and I just overlooked it?
I hacked up this patch mostly for Kris and md(4) memory-backed UFS, so
on file remove space can be returned to the system.

I think you should ask Kirk what to do about that, but I'm afraid my
patch can break SU - what if we TRIM, but then panic and on fsck decide
to actually use the block?

BTW. Have you actually observed any performance degradation without
TRIM? I've similar SSDs and from what I tested it somehow can handle
wear leveling internally. You can to TRIM entire disk using this simple
program below, newfs it and test it. Then fill it with random data,
newfs it again, test it and compare results.

=== Makefile:
PROG=   trim
LDADD=  -lgeom
NO_MAN=
WARNS?= 6

.include <bsd.prog.mk>

=== trim.c:
#include <err.h>
#include <stdlib.h>

#include <libgeom.h>

int
main(int argc, char *argv[])
{
        off_t mediasize;
        int fd;

        if (argc != 2)
                errx(1, "usage: %s disk", getprogname());
        if ((fd = g_open(argv[1], 1)) < 0)
                err(2, "g_open(%s)", argv[1]);
        if ((mediasize = g_mediasize(fd)) < 0)
                err(3, "g_mediasize()");
        if (g_delete(fd, 0, mediasize) < 0)
                err(4, "g_delete()");
        g_close(fd);
        exit(0);
}

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[hidden email]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Matthew Jacob-2

> BTW. Have you actually observed any performance degradation without
> TRIM? I've similar SSDs and from what I tested it somehow can handle
> wear leveling internally. You can to TRIM entire disk using this simple
> program below, newfs it and test it. Then fill it with random data,
> newfs it again, test it and compare results.

Yes. I've seen performance degradation. It's not clear that TRIM is the
most useful in all cases, partly because it's generally a very slow
command and it's not quite clear how well the wear levelling algorithms
incorporate TRIM changes.

One thing is certainly true is that if SSD drives are for caching only,
a SECURE ERASE at each 'mount' time along with the appropriate newfs
helps substantially.
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Oliver Fromme
In reply to this post by Pawel Jakub Dawidek
Pawel Jakub Dawidek wrote:
 > On Tue, Dec 07, 2010 at 04:31:14PM +0100, Oliver Fromme wrote:
 > > I've bought an OCZ Vertex2 E (120 GB SSD) and installed
 > > FreeBSD i386 stable/8 on it, using UFS (UFS2, to be exact).
 > > I've made sure that the partitions are aligned properly,
 > > and used newfs with 4k fragsize and 32k blocksize.
 > > It works very well so far.

(I should also mention that I mounted all filesystems from
the SSD with the "noatime" option, to reduce writes during
normal operation.)

 > > So, my question is, are there plans to add TRIM support
 > > to UFS?  Is anyone working on it?  Or is it already there
 > > and I just overlooked it?
 >
 > I hacked up this patch mostly for Kris and md(4) memory-backed UFS, so
 > on file remove space can be returned to the system.

I see.

 > I think you should ask Kirk what to do about that, but I'm afraid my
 > patch can break SU - what if we TRIM, but then panic and on fsck decide
 > to actually use the block?

Oh, you're right.  That could be a problem.

Maybe it would be better to write a separate tool that
performs TRIM commands on areas of the file system that
are unused for a while.

I also remember that mav@ wrote that the TRIM command is
very slow.  So, it's probably not feasible to execute it
each time some blocks are freed, because it would make the
file system much slower and nullify all advantages of the
SSD.

Just found his comment from r201139:
"I have no idea whether it is normal, but for some reason it takes 200ms
to handle any TRIM command on this drive, that was making delete extremely
slow. But TRIM command is able to accept long list of LBAs and the length of
that list seems doesn't affect it's execution time. Implemented request
clusting algorithm allowed me to rise delete rate up to reasonable numbers,
when many parallel DELETE requests running."

 > BTW. Have you actually observed any performance degradation without
 > TRIM?

Not yet.  My SSD is still very new.  It carries only the
base system (/home is on a normal 1TB disk), so not many
writes happened so far.  But as soon as I start doing more
write access (buildworld + installworld, updating ports
and so on), I expect that performance will degrade over
time.

I've also heard from several people on various mailing lists
that the performance of their SSD drives got worse after
some time.

That performance degradation is caused by so-called "static
wear leveling".  The drive will have to move the contents
of blocks that are never (or rarely) written to to other
blocks, so they can be overwritten, in order to distribute
wear equally over all blocks.  If a block is known to be
unused (which is the case when the drive is new, or after
a TRIM command), the contents don't have to be moved, so
the write operation is much faster.  I think all modern
SSD drives use static wear leveling.

Without TRIM support in the file system, a work-around is
to "newfs -E" the file system when the performance gets
too bad.  This requires a backup-restore cycle, of course,
so it's a somewhat annoying.

Another work-around is to leave some space unused, i.e.
don't use 20% at the end of the SSD for any file systems,
for example.  Since those 20% are never written to, they
are known to be unused to the SSD's firmware, so it can
use them for wear leveling.  This will postpone the
performance degradation somewhat, but it won't completely
avoid it, ultimately.  And wasting some space is not a
very satisfying solution either.

 > I've similar SSDs and from what I tested it somehow can handle
 > wear leveling internally. You can to TRIM entire disk using this simple
 > program below, newfs it and test it.

It does basically the same as "newfs -E", right?

 > Then fill it with random data, newfs it again, test it and compare
 > results.

Filling it just once will probably not have much of an
effect.  In fact, wear leveling will probably not kick
in if you just fill the whole disk, because all blocks
are used equally anyway.

The performance degradation will only start to occur
after a while (weeks or months) when some blocks are
written much more often than others.  In this situation,
(static) wear leveling will kick in and start moving
data in order to re-use seldom-written-to blocks.

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"C is quirky, flawed, and an enormous success."
        -- Dennis M. Ritchie.
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Julian Elischer-5
kirk does have some trim patches
which he sent to me once..
let me look... hmmm

ah here it is..
this may or  may not be out of date.
I'll let Kirk chime in if he thinks it 's worth it..

I include the email from him as an attachment,
hopefully it wont get stripped by the list, but you should both get it..

julian

On 12/8/10 8:58 AM, Oliver Fromme wrote:

> Pawel Jakub Dawidek wrote:
>   >  On Tue, Dec 07, 2010 at 04:31:14PM +0100, Oliver Fromme wrote:
>   >  >  I've bought an OCZ Vertex2 E (120 GB SSD) and installed
>   >  >  FreeBSD i386 stable/8 on it, using UFS (UFS2, to be exact).
>   >  >  I've made sure that the partitions are aligned properly,
>   >  >  and used newfs with 4k fragsize and 32k blocksize.
>   >  >  It works very well so far.
>
> (I should also mention that I mounted all filesystems from
> the SSD with the "noatime" option, to reduce writes during
> normal operation.)
>
>   >  >  So, my question is, are there plans to add TRIM support
>   >  >  to UFS?  Is anyone working on it?  Or is it already there
>   >  >  and I just overlooked it?
>   >
>   >  I hacked up this patch mostly for Kris and md(4) memory-backed UFS, so
>   >  on file remove space can be returned to the system.
>
> I see.
>
>   >  I think you should ask Kirk what to do about that, but I'm afraid my
>   >  patch can break SU - what if we TRIM, but then panic and on fsck decide
>   >  to actually use the block?
>
> Oh, you're right.  That could be a problem.
>
> Maybe it would be better to write a separate tool that
> performs TRIM commands on areas of the file system that
> are unused for a while.
>
> I also remember that mav@ wrote that the TRIM command is
> very slow.  So, it's probably not feasible to execute it
> each time some blocks are freed, because it would make the
> file system much slower and nullify all advantages of the
> SSD.
>
> Just found his comment from r201139:
> "I have no idea whether it is normal, but for some reason it takes 200ms
> to handle any TRIM command on this drive, that was making delete extremely
> slow. But TRIM command is able to accept long list of LBAs and the length of
> that list seems doesn't affect it's execution time. Implemented request
> clusting algorithm allowed me to rise delete rate up to reasonable numbers,
> when many parallel DELETE requests running."
>
>   >  BTW. Have you actually observed any performance degradation without
>   >  TRIM?
>
> Not yet.  My SSD is still very new.  It carries only the
> base system (/home is on a normal 1TB disk), so not many
> writes happened so far.  But as soon as I start doing more
> write access (buildworld + installworld, updating ports
> and so on), I expect that performance will degrade over
> time.
>
> I've also heard from several people on various mailing lists
> that the performance of their SSD drives got worse after
> some time.
>
> That performance degradation is caused by so-called "static
> wear leveling".  The drive will have to move the contents
> of blocks that are never (or rarely) written to to other
> blocks, so they can be overwritten, in order to distribute
> wear equally over all blocks.  If a block is known to be
> unused (which is the case when the drive is new, or after
> a TRIM command), the contents don't have to be moved, so
> the write operation is much faster.  I think all modern
> SSD drives use static wear leveling.
>
> Without TRIM support in the file system, a work-around is
> to "newfs -E" the file system when the performance gets
> too bad.  This requires a backup-restore cycle, of course,
> so it's a somewhat annoying.
>
> Another work-around is to leave some space unused, i.e.
> don't use 20% at the end of the SSD for any file systems,
> for example.  Since those 20% are never written to, they
> are known to be unused to the SSD's firmware, so it can
> use them for wear leveling.  This will postpone the
> performance degradation somewhat, but it won't completely
> avoid it, ultimately.  And wasting some space is not a
> very satisfying solution either.
>
>   >  I've similar SSDs and from what I tested it somehow can handle
>   >  wear leveling internally. You can to TRIM entire disk using this simple
>   >  program below, newfs it and test it.
>
> It does basically the same as "newfs -E", right?
>
>   >  Then fill it with random data, newfs it again, test it and compare
>   >  results.
>
> Filling it just once will probably not have much of an
> effect.  In fact, wear leveling will probably not kick
> in if you just fill the whole disk, because all blocks
> are used equally anyway.
>
> The performance degradation will only start to occur
> after a while (weeks or months) when some blocks are
> written much more often than others.  In this situation,
> (static) wear leveling will kick in and start moving
> data in order to re-use seldom-written-to blocks.
>
> Best regards
>     Oliver
>

> Date: Tue, 03 Nov 2009 14:03:14 -0800

> From: Julian Elischer <[hidden email]>
> To: Kirk McKusick <[hidden email]>
> CC: Scott Long <[hidden email]>
> Subject: UFS2 and TRIM command
>
> Kirk,
>
> You mentioned at BSDCan that you had done some work with the 'trim'
> command to tell a drive that a filesystem was no longer intersted
> in a particular block.
>
> can you let us know what the state of that work was and whether you
> have done anything more on it?
>
> thanks,
>
> Julian
Enclosed is my proposed patch that I sent to Poul_henning Kamp.
He was working with flash media at the time and wanted notification
when blocks were released so he could pre-clear them so they could
be used directly when next allocated. i never heard back from him
and consequently never followed up on it.

        ~Kirk

=-=-=-=

From:    Kirk McKusick <[hidden email]>
Date:    Wed, 21 May 2008 13:19:18 -0700
To:      Poul-Henning Kamp <[hidden email]>
Subject: UFS and BIO_DELETE
X-URL: http://WWW.McKusick.COM/
Reply-To: Kirk McKusick <[hidden email]>

I enclose below my proposed patch to add BIO_DELETE to UFS
(goes at the end of ffs_blkfree). As I have no way to test
it, I am wondering if you could let me know if it works.

Also, I am thinking of only enabling it for filesystems mounted
with a new flag requesting the behavior since the geteblk is a
rather expensive call for the usual no-op case.

I did look at just allocating a `struct bio' as a local variable
and using that, but it looked like I needed to also come up with a
`producer' and/or `consumer' if I wanted to pass it to GEOM directly,
so in the end I went with this more expensive solution. If there
is an easy way to just pass a bio structure to GEOM, I would much
prefer that approach.

        ~Kirk


*** ffs_alloc.c Wed May 21 20:11:04 2008
--- ffs_alloc.c.new Wed May 21 20:10:50 2008
***************
*** 1945,1950 ****
--- 1945,1962 ----
  ACTIVECLEAR(fs, cg);
  UFS_UNLOCK(ump);
  bdwrite(bp);
+ /*
+ * Request that the block be cleared.
+ */
+ bp = geteblk(size);
+ bp->b_iocmd = BIO_DELETE;
+ bp->b_vp = devvp;
+ bp->b_blkno = fsbtodb(fs, bno);
+ bp->b_offset = dbtob(bp->b_blkno);
+ bp->b_iooffset = bp->b_offset;
+ bp->b_bcount = size;
+ BUF_KERNPROC(bp);
+ BO_STRATEGY(&devvp->v_bufobj, bp);
  }
 
  #ifdef INVARIANTS


_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Matthew Jacob-2
In reply to this post by Oliver Fromme
Perhaps a better solution here is a different filesystem? Since seek
time is not an issue for SSDs, establishing large read-only errors with
persistent large write journals (a la sprite) in one (large block)
locale would be a better fit.
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Konstantin Belousov
In reply to this post by Julian Elischer-5
On Wed, Dec 08, 2010 at 09:07:56AM -0800, Julian Elischer wrote:

>
> From:    Kirk McKusick <[hidden email]>
> Date:    Wed, 21 May 2008 13:19:18 -0700
> To:      Poul-Henning Kamp <[hidden email]>
> Subject: UFS and BIO_DELETE
> X-URL: http://WWW.McKusick.COM/
> Reply-To: Kirk McKusick <[hidden email]>
>
> I enclose below my proposed patch to add BIO_DELETE to UFS
> (goes at the end of ffs_blkfree). As I have no way to test
> it, I am wondering if you could let me know if it works.
>
> Also, I am thinking of only enabling it for filesystems mounted
> with a new flag requesting the behavior since the geteblk is a
> rather expensive call for the usual no-op case.
>
> I did look at just allocating a `struct bio' as a local variable
> and using that, but it looked like I needed to also come up with a
> `producer' and/or `consumer' if I wanted to pass it to GEOM directly,
> so in the end I went with this more expensive solution. If there
> is an easy way to just pass a bio structure to GEOM, I would much
> prefer that approach.
>
> ~Kirk
>
>
> *** ffs_alloc.c Wed May 21 20:11:04 2008
> --- ffs_alloc.c.new Wed May 21 20:10:50 2008
> ***************
> *** 1945,1950 ****
> --- 1945,1962 ----
>   ACTIVECLEAR(fs, cg);
>   UFS_UNLOCK(ump);
>   bdwrite(bp);
> + /*
> + * Request that the block be cleared.
> + */
> + bp = geteblk(size);
> + bp->b_iocmd = BIO_DELETE;
> + bp->b_vp = devvp;
> + bp->b_blkno = fsbtodb(fs, bno);
> + bp->b_offset = dbtob(bp->b_blkno);
> + bp->b_iooffset = bp->b_offset;
> + bp->b_bcount = size;
> + BUF_KERNPROC(bp);
> + BO_STRATEGY(&devvp->v_bufobj, bp);
>   }
>  
>   #ifdef INVARIANTS
>
The UFS devvp contains the pointer to struct g_consumer in
devpp->v_bufobj->bo_private, so g_alloc_bio() and g_io_request() can be
used to start BIO_DELETE command without fake bufer allocation.
g_io_strategy() may be used as the sample.

But, I have a question about the patch. The bitmap block in ffs_blkfree()
is handled with delayed write, by bdwrite() call. I think that it is
quite possible for BIO_DELETE to be executed before the bitmap block
is written and on-disk bit is set in bitmap. Would not this allow
for the blocks to be deleted before the bitmap is updated,
and potentially before on-disk pointers are cleared that points to
the blocks ? SU just rolls back the unfinished dependencies on write,
but BIO_DELETE cannot.

Also, shouldn't the BIO_DELETE only be issued for real devices,
and not the snapshots ?

diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index b740bbb..4ff57ae 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -83,6 +83,8 @@ __FBSDID("$FreeBSD$");
 
 #include <security/audit/audit.h>
 
+#include <geom/geom.h>
+
 #include <ufs/ufs/dir.h>
 #include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
@@ -1850,6 +1852,7 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  u_int cg;
  u_int8_t *blksfree;
  struct cdev *dev;
+ struct bio *bip;
 
  cg = dtog(fs, bno);
  if (devvp->v_type == VREG) {
@@ -1962,6 +1965,16 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  softdep_setup_blkfree(UFSTOVFS(ump), bp, bno,
     numfrags(fs, size), dephd);
  bdwrite(bp);
+
+ if (devvp->v_type != VREG) {
+ bip = g_alloc_bio();
+ bip->bio_cmd = BIO_DELETE;
+ bip->bio_offset = dbtob(fsbtodb(fs, bno));
+ bip->bio_done = g_destroy_bio;
+ bip->bio_length = size;
+ g_io_request(bip,
+    (struct g_consumer *)devvp->v_bufobj.bo_private);
+ }
 }
 
 #ifdef INVARIANTS


attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Kirk McKusick-2
> Date: Wed, 8 Dec 2010 23:49:09 +0200
> From: Kostik Belousov <[hidden email]>
> To: Julian Elischer <[hidden email]>
> Cc: [hidden email], [hidden email],
>         Oliver Fromme <[hidden email]>,
>         Kirk McKusick <[hidden email]>
> Subject: Re: TRIM support for UFS?
>
> On Wed, Dec 08, 2010 at 09:07:56AM -0800, Julian Elischer wrote:
> >
> > From:    Kirk McKusick <[hidden email]>
> > Date:    Wed, 21 May 2008 13:19:18 -0700
> > To:      Poul-Henning Kamp <[hidden email]>
> > Subject: UFS and BIO_DELETE
> > X-URL: http://WWW.McKusick.COM/
> > Reply-To: Kirk McKusick <[hidden email]>
> >
> > I enclose below my proposed patch to add BIO_DELETE to UFS
> > (goes at the end of ffs_blkfree). As I have no way to test
> > it, I am wondering if you could let me know if it works.
> >
> > Also, I am thinking of only enabling it for filesystems mounted
> > with a new flag requesting the behavior since the geteblk is a
> > rather expensive call for the usual no-op case.
> >
> > I did look at just allocating a `struct bio' as a local variable
> > and using that, but it looked like I needed to also come up with a
> > `producer' and/or `consumer' if I wanted to pass it to GEOM directly,
> > so in the end I went with this more expensive solution. If there
> > is an easy way to just pass a bio structure to GEOM, I would much
> > prefer that approach.
> >
> > ~Kirk
> >
> >
> > *** ffs_alloc.c Wed May 21 20:11:04 2008
> > --- ffs_alloc.c.new Wed May 21 20:10:50 2008
> > ***************
> > *** 1945,1950 ****
> > --- 1945,1962 ----
> >   ACTIVECLEAR(fs, cg);
> >   UFS_UNLOCK(ump);
> >   bdwrite(bp);
> > + /*
> > + * Request that the block be cleared.
> > + */
> > + bp = geteblk(size);
> > + bp->b_iocmd = BIO_DELETE;
> > + bp->b_vp = devvp;
> > + bp->b_blkno = fsbtodb(fs, bno);
> > + bp->b_offset = dbtob(bp->b_blkno);
> > + bp->b_iooffset = bp->b_offset;
> > + bp->b_bcount = size;
> > + BUF_KERNPROC(bp);
> > + BO_STRATEGY(&devvp->v_bufobj, bp);
> >   }
> >
> >   #ifdef INVARIANTS
> >
> The UFS devvp contains the pointer to struct g_consumer in
> devpp->v_bufobj->bo_private, so g_alloc_bio() and g_io_request() can be
> used to start BIO_DELETE command without fake bufer allocation.
> g_io_strategy() may be used as the sample.

Thanks for that update to moderize the patch!

> But, I have a question about the patch. The bitmap block in ffs_blkfree()
> is handled with delayed write, by bdwrite() call. I think that it is
> quite possible for BIO_DELETE to be executed before the bitmap block
> is written and on-disk bit is set in bitmap. Would not this allow
> for the blocks to be deleted before the bitmap is updated,
> and potentially before on-disk pointers are cleared that points to
> the blocks ? SU just rolls back the unfinished dependencies on write,
> but BIO_DELETE cannot.

It is safe to do the BIO_DELETE here because at this point we have
progressed through the file delete to the point where the inode that
claimed this block has been updated on the disk with a NULL pointer.
The only step that remains is to update the on-disk bitmap to reflect
that the block is available. If the bitmap write is not done before
a system crash the only action that will be taken with respect to
the freed block is that either background fsck will restore it to
the bitmap (SU) or the journal will restore it to the bitmap (SUJ).
There is no case where either SU or SUJ will put it back into the
file once we have reached this point.

> Also, shouldn't the BIO_DELETE only be issued for real devices,
> and not the snapshots ?

They should also be issued for snapshots. The only time that snapshots
give up blocks is when they are deleted. The blocks that they give up
at that time are all copies of blocks that were later changed in the
filesystem. With the snapshot gone there will be no remaining references
to those blocks and they will be made available for reallocation. As
such, it would be helpful if they had been TRIMMED.

> diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
> index b740bbb..4ff57ae 100644
> --- a/sys/ufs/ffs/ffs_alloc.c
> +++ b/sys/ufs/ffs/ffs_alloc.c
> @@ -83,6 +83,8 @@ __FBSDID("$FreeBSD$");
>
>  #include <security/audit/audit.h>
>
> +#include <geom/geom.h>
> +
>  #include <ufs/ufs/dir.h>
>  #include <ufs/ufs/extattr.h>
>  #include <ufs/ufs/quota.h>
> @@ -1850,6 +1852,7 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
>   u_int cg;
>   u_int8_t *blksfree;
>   struct cdev *dev;
> + struct bio *bip;
>
>   cg = dtog(fs, bno);
>   if (devvp->v_type == VREG) {
> @@ -1962,6 +1965,16 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
>   softdep_setup_blkfree(UFSTOVFS(ump), bp, bno,
>      numfrags(fs, size), dephd);
>   bdwrite(bp);
> +
> + if (devvp->v_type != VREG) {
> + bip = g_alloc_bio();
> + bip->bio_cmd = BIO_DELETE;
> + bip->bio_offset = dbtob(fsbtodb(fs, bno));
> + bip->bio_done = g_destroy_bio;
> + bip->bio_length = size;
> + g_io_request(bip,
> +    (struct g_consumer *)devvp->v_bufobj.bo_private);
> + }
>  }
>
>  #ifdef INVARIANTS

The above patch looks good though I would do it unconditionally
(e.g., also for snapshots). It seems sensible to make it conditional
on a sysctl variable so that folks could experiment with it more
easily. And I would leave it off by default as non-SSD disks are
unlikely to benefit from it. If it does prove useful for SSD disks
then I would make it conditional on a filesystem flag so that it
is possible to enable it on a filesystem-by-filesystem basis (e.g.,
on for filesystems on the SSD and off for the rest).

        Kirk McKusick
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Pawel Jakub Dawidek
On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:
> The above patch looks good though I would do it unconditionally
> (e.g., also for snapshots). It seems sensible to make it conditional
> on a sysctl variable so that folks could experiment with it more
> easily. And I would leave it off by default as non-SSD disks are
> unlikely to benefit from it. If it does prove useful for SSD disks
> then I would make it conditional on a filesystem flag so that it
> is possible to enable it on a filesystem-by-filesystem basis (e.g.,
> on for filesystems on the SSD and off for the rest).

We already have a flag for disks: DISKFLAG_CANDELETE, which tells if the
disk support TRIM or not. Next we should add BIO_GETATTR attribute for
DISK class to return true if TRIM is supported. This way UFS can ask if
TRIM is supported on mount and don't bother sending BIO_DELETE if it is
not supported.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[hidden email]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Konstantin Belousov
In reply to this post by Kirk McKusick-2
On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:

> > Date: Wed, 8 Dec 2010 23:49:09 +0200
> > From: Kostik Belousov <[hidden email]>
> > To: Julian Elischer <[hidden email]>
> > Cc: [hidden email], [hidden email],
> >         Oliver Fromme <[hidden email]>,
> >         Kirk McKusick <[hidden email]>
> > Subject: Re: TRIM support for UFS?
> >
> > On Wed, Dec 08, 2010 at 09:07:56AM -0800, Julian Elischer wrote:
> > >
> > > From:    Kirk McKusick <[hidden email]>
> > > Date:    Wed, 21 May 2008 13:19:18 -0700
> > > To:      Poul-Henning Kamp <[hidden email]>
> > > Subject: UFS and BIO_DELETE
> > > X-URL: http://WWW.McKusick.COM/
> > > Reply-To: Kirk McKusick <[hidden email]>
> > >
> > > I enclose below my proposed patch to add BIO_DELETE to UFS
> > > (goes at the end of ffs_blkfree). As I have no way to test
> > > it, I am wondering if you could let me know if it works.
> > >
> > > Also, I am thinking of only enabling it for filesystems mounted
> > > with a new flag requesting the behavior since the geteblk is a
> > > rather expensive call for the usual no-op case.
> > >
> > > I did look at just allocating a `struct bio' as a local variable
> > > and using that, but it looked like I needed to also come up with a
> > > `producer' and/or `consumer' if I wanted to pass it to GEOM directly,
> > > so in the end I went with this more expensive solution. If there
> > > is an easy way to just pass a bio structure to GEOM, I would much
> > > prefer that approach.
> > >
> > > ~Kirk
> > >
> > >
> > > *** ffs_alloc.c Wed May 21 20:11:04 2008
> > > --- ffs_alloc.c.new Wed May 21 20:10:50 2008
> > > ***************
> > > *** 1945,1950 ****
> > > --- 1945,1962 ----
> > >   ACTIVECLEAR(fs, cg);
> > >   UFS_UNLOCK(ump);
> > >   bdwrite(bp);
> > > + /*
> > > + * Request that the block be cleared.
> > > + */
> > > + bp = geteblk(size);
> > > + bp->b_iocmd = BIO_DELETE;
> > > + bp->b_vp = devvp;
> > > + bp->b_blkno = fsbtodb(fs, bno);
> > > + bp->b_offset = dbtob(bp->b_blkno);
> > > + bp->b_iooffset = bp->b_offset;
> > > + bp->b_bcount = size;
> > > + BUF_KERNPROC(bp);
> > > + BO_STRATEGY(&devvp->v_bufobj, bp);
> > >   }
> > >
> > >   #ifdef INVARIANTS
> > >
> > The UFS devvp contains the pointer to struct g_consumer in
> > devpp->v_bufobj->bo_private, so g_alloc_bio() and g_io_request() can be
> > used to start BIO_DELETE command without fake bufer allocation.
> > g_io_strategy() may be used as the sample.
>
> Thanks for that update to moderize the patch!
>
> > But, I have a question about the patch. The bitmap block in ffs_blkfree()
> > is handled with delayed write, by bdwrite() call. I think that it is
> > quite possible for BIO_DELETE to be executed before the bitmap block
> > is written and on-disk bit is set in bitmap. Would not this allow
> > for the blocks to be deleted before the bitmap is updated,
> > and potentially before on-disk pointers are cleared that points to
> > the blocks ? SU just rolls back the unfinished dependencies on write,
> > but BIO_DELETE cannot.
>
> It is safe to do the BIO_DELETE here because at this point we have
> progressed through the file delete to the point where the inode that
> claimed this block has been updated on the disk with a NULL pointer.
> The only step that remains is to update the on-disk bitmap to reflect
> that the block is available. If the bitmap write is not done before
> a system crash the only action that will be taken with respect to
> the freed block is that either background fsck will restore it to
> the bitmap (SU) or the journal will restore it to the bitmap (SUJ).
> There is no case where either SU or SUJ will put it back into the
> file once we have reached this point.
Thanks for the explanation. On the other hand, can my scenario be real
for async mounts ?

>
> > Also, shouldn't the BIO_DELETE only be issued for real devices,
> > and not the snapshots ?
>
> They should also be issued for snapshots. The only time that snapshots
> give up blocks is when they are deleted. The blocks that they give up
> at that time are all copies of blocks that were later changed in the
> filesystem. With the snapshot gone there will be no remaining references
> to those blocks and they will be made available for reallocation. As
> such, it would be helpful if they had been TRIMMED.
>
> > diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
> > index b740bbb..4ff57ae 100644
> > --- a/sys/ufs/ffs/ffs_alloc.c
> > +++ b/sys/ufs/ffs/ffs_alloc.c
> > @@ -83,6 +83,8 @@ __FBSDID("$FreeBSD$");
> >
> >  #include <security/audit/audit.h>
> >
> > +#include <geom/geom.h>
> > +
> >  #include <ufs/ufs/dir.h>
> >  #include <ufs/ufs/extattr.h>
> >  #include <ufs/ufs/quota.h>
> > @@ -1850,6 +1852,7 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
> >   u_int cg;
> >   u_int8_t *blksfree;
> >   struct cdev *dev;
> > + struct bio *bip;
> >
> >   cg = dtog(fs, bno);
> >   if (devvp->v_type == VREG) {
> > @@ -1962,6 +1965,16 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
> >   softdep_setup_blkfree(UFSTOVFS(ump), bp, bno,
> >      numfrags(fs, size), dephd);
> >   bdwrite(bp);
> > +
> > + if (devvp->v_type != VREG) {
> > + bip = g_alloc_bio();
> > + bip->bio_cmd = BIO_DELETE;
> > + bip->bio_offset = dbtob(fsbtodb(fs, bno));
> > + bip->bio_done = g_destroy_bio;
> > + bip->bio_length = size;
> > + g_io_request(bip,
> > +    (struct g_consumer *)devvp->v_bufobj.bo_private);
> > + }
> >  }
> >
> >  #ifdef INVARIANTS
>
> The above patch looks good though I would do it unconditionally
> (e.g., also for snapshots). It seems sensible to make it conditional
> on a sysctl variable so that folks could experiment with it more
> easily. And I would leave it off by default as non-SSD disks are
> unlikely to benefit from it. If it does prove useful for SSD disks
> then I would make it conditional on a filesystem flag so that it
> is possible to enable it on a filesystem-by-filesystem basis (e.g.,
> on for filesystems on the SSD and off for the rest).
I think it is better to have a flag in superblock instead of the mount
option.

diff --git a/sbin/dumpfs/dumpfs.c b/sbin/dumpfs/dumpfs.c
index 38c05f6..fa562dc 100644
--- a/sbin/dumpfs/dumpfs.c
+++ b/sbin/dumpfs/dumpfs.c
@@ -253,9 +253,11 @@ dumpfs(const char *name)
  printf("fs_flags expanded ");
  if (fsflags & FS_NFS4ACLS)
  printf("nfsv4acls ");
+ if (fsflags & FS_TRIM)
+ printf("trim ");
  fsflags &= ~(FS_UNCLEAN | FS_DOSOFTDEP | FS_NEEDSFSCK | FS_INDEXDIRS |
      FS_ACLS | FS_MULTILABEL | FS_GJOURNAL | FS_FLAGS_UPDATED |
-     FS_NFS4ACLS | FS_SUJ);
+     FS_NFS4ACLS | FS_SUJ | FS_TRIM);
  if (fsflags != 0)
  printf("unknown flags (%#x)", fsflags);
  putchar('\n');
diff --git a/sbin/tunefs/tunefs.c b/sbin/tunefs/tunefs.c
index b2ea602..0ed713d 100644
--- a/sbin/tunefs/tunefs.c
+++ b/sbin/tunefs/tunefs.c
@@ -82,11 +82,13 @@ int
 main(int argc, char *argv[])
 {
  char *avalue, *jvalue, *Jvalue, *Lvalue, *lvalue, *Nvalue, *nvalue;
+ char *tvalue;
  const char *special, *on;
  const char *name;
  int active;
  int Aflag, aflag, eflag, evalue, fflag, fvalue, jflag, Jflag, Lflag;
  int lflag, mflag, mvalue, Nflag, nflag, oflag, ovalue, pflag, sflag;
+ int tflag;
  int svalue, Sflag, Svalue;
  int ch, found_arg, i;
  const char *chg[2];
@@ -101,7 +103,8 @@ main(int argc, char *argv[])
  evalue = fvalue = mvalue = ovalue = svalue = Svalue = 0;
  active = 0;
  found_arg = 0; /* At least one arg is required. */
- while ((ch = getopt(argc, argv, "Aa:e:f:j:J:L:l:m:N:n:o:ps:S:")) != -1)
+ while ((ch = getopt(argc, argv, "Aa:e:f:j:J:L:l:m:N:n:o:ps:S:t:"))
+    != -1)
  switch (ch) {
 
  case 'A':
@@ -268,6 +271,18 @@ main(int argc, char *argv[])
  Sflag = 1;
  break;
 
+ case 't':
+ found_arg = 1;
+ name = "trim";
+ tvalue = optarg;
+ if (strcmp(tvalue, "enable") != 0 &&
+    strcmp(tvalue, "disable") != 0) {
+ errx(10, "bad %s (options are %s)",
+    name, "`enable' or `disable'");
+ }
+ tflag = 1;
+ break;
+
  default:
  usage();
  }
@@ -493,6 +508,24 @@ main(int argc, char *argv[])
  sblock.fs_avgfpdir = svalue;
  }
  }
+ if (tflag) {
+ name = "issue TRIM to the disk";
+ if (strcmp(tvalue, "enable") == 0) {
+ if (sblock.fs_flags & FS_TRIM)
+ warnx("%s remains unchanged as enabled", name);
+ else {
+ sblock.fs_flags |= FS_TRIM;
+ warnx("%s set", name);
+ }
+ } else if (strcmp(tvalue, "disable") == 0) {
+ if ((~sblock.fs_flags & FS_TRIM) == FS_TRIM)
+ warnx("%s remains unchanged as disabled", name);
+ else {
+ sblock.fs_flags &= ~FS_TRIM;
+ warnx("%s cleared", name);
+ }
+ }
+ }
 
  if (sbwrite(&disk, Aflag) == -1)
  goto err;
@@ -1011,12 +1044,13 @@ out:
 void
 usage(void)
 {
- fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n",
+ fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n%s\n",
 "usage: tunefs [-A] [-a enable | disable] [-e maxbpg] [-f avgfilesize]",
 "              [-J enable | disable] [-j enable | disable]",
 "              [-L volname] [-l enable | disable] [-m minfree]",
 "              [-N enable | disable] [-n enable | disable]",
-"              [-o space | time] [-p] [-s avgfpdir] special | filesystem");
+"              [-o space | time] [-p] [-s avgfpdir] [-t enable | disable]",
+"              special | filesystem");
  exit(2);
 }
 
@@ -1035,6 +1069,8 @@ printfs(void)
  (sblock.fs_flags & FS_SUJ)? "enabled" : "disabled");
  warnx("gjournal: (-J)                                     %s",
  (sblock.fs_flags & FS_GJOURNAL)? "enabled" : "disabled");
+ warnx("trim: (-p)                                         %s",
+ (sblock.fs_flags & FS_TRIM)? "enabled" : "disabled");
  warnx("maximum blocks per file in a cylinder group: (-e)  %d",
       sblock.fs_maxbpg);
  warnx("average file size: (-f)                            %d",
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index b740bbb..ed39aa3 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -83,6 +83,8 @@ __FBSDID("$FreeBSD$");
 
 #include <security/audit/audit.h>
 
+#include <geom/geom.h>
+
 #include <ufs/ufs/dir.h>
 #include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
@@ -1850,6 +1852,7 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  u_int cg;
  u_int8_t *blksfree;
  struct cdev *dev;
+ struct bio *bip;
 
  cg = dtog(fs, bno);
  if (devvp->v_type == VREG) {
@@ -1962,6 +1965,16 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  softdep_setup_blkfree(UFSTOVFS(ump), bp, bno,
     numfrags(fs, size), dephd);
  bdwrite(bp);
+
+ if (fs->fs_flags & FS_TRIM) {
+ bip = g_alloc_bio();
+ bip->bio_cmd = BIO_DELETE;
+ bip->bio_offset = dbtob(fsbtodb(fs, bno));
+ bip->bio_done = g_destroy_bio;
+ bip->bio_length = size;
+ g_io_request(bip,
+    (struct g_consumer *)devvp->v_bufobj.bo_private);
+ }
 }
 
 #ifdef INVARIANTS
diff --git a/sys/ufs/ffs/fs.h b/sys/ufs/ffs/fs.h
index ba98ed3..13d9ede 100644
--- a/sys/ufs/ffs/fs.h
+++ b/sys/ufs/ffs/fs.h
@@ -417,6 +417,7 @@ CTASSERT(sizeof(struct fs) == 1376);
 #define FS_FLAGS_UPDATED 0x0080 /* flags have been moved to new location */
 #define FS_NFS4ACLS 0x0100 /* file system has NFSv4 ACLs enabled */
 #define FS_INDEXDIRS 0x0200 /* kernel supports indexed directories */
+#define FS_TRIM 0x0400 /* issue BIO_DELETE for deleted blocks */
 
 /*
  * Macros to access bits in the fs_active array.

attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Kirk McKusick-2
> Date: Thu, 9 Dec 2010 00:53:52 +0200
> From: Kostik Belousov <[hidden email]>
> To: Kirk McKusick <[hidden email]>
> Cc: Julian Elischer <[hidden email]>, [hidden email],
>         [hidden email], Oliver Fromme <[hidden email]>
> Subject: Re: TRIM support for UFS?
>
> On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:
> >
> > It is safe to do the BIO_DELETE here because at this point we have
> > progressed through the file delete to the point where the inode that
> > claimed this block has been updated on the disk with a NULL pointer.
>
> Thanks for the explanation. On the other hand, can my scenario be real
> for async mounts ?

Async mounts are not compatible with SU or SUJ. You are running the
filesystem in a mode where there is no guarantee of recovery after
a crash. So yes, you may TRIM blocks that are still claimed by on-disk
inodes. But that is likely the least of your worries. Since async makes
no claims about filesystem consistency after a crash, adding one more
way for it to break does not seem like a big deal :-)

> > The above patch looks good though I would do it unconditionally
> > (e.g., also for snapshots). It seems sensible to make it conditional
> > on a sysctl variable so that folks could experiment with it more
> > easily. And I would leave it off by default as non-SSD disks are
> > unlikely to benefit from it. If it does prove useful for SSD disks
> > then I would make it conditional on a filesystem flag so that it
> > is possible to enable it on a filesystem-by-filesystem basis (e.g.,
> > on for filesystems on the SSD and off for the rest).
>
> I think it is better to have a flag in superblock instead of the mount
> option.
>
> Most of diff deleted...
>
> diff --git a/sys/ufs/ffs/fs.h b/sys/ufs/ffs/fs.h
> index ba98ed3..13d9ede 100644
> --- a/sys/ufs/ffs/fs.h
> +++ b/sys/ufs/ffs/fs.h
> @@ -417,6 +417,7 @@ CTASSERT(sizeof(struct fs) == 1376);
>  #define FS_FLAGS_UPDATED 0x0080 /* flags have been moved to new location */
>  #define FS_NFS4ACLS 0x0100 /* file system has NFSv4 ACLs enabled */
>  #define FS_INDEXDIRS 0x0200 /* kernel supports indexed directories */
> +#define FS_TRIM 0x0400 /* issue BIO_DELETE for deleted blocks */
>
>  /*
>   * Macros to access bits in the fs_active array.

I approve of this change and am happy to see it (finally) get added.

> On Wed, 8 Dec 2010 at 23:33:27 +0100, Pawel Jakub Dawidek wrote:
>> On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:
>> The above patch looks good though I would do it unconditionally
>> (e.g., also for snapshots). It seems sensible to make it conditional
>> on a sysctl variable so that folks could experiment with it more
>> easily. And I would leave it off by default as non-SSD disks are
>> unlikely to benefit from it. If it does prove useful for SSD disks
>> then I would make it conditional on a filesystem flag so that it
>> is possible to enable it on a filesystem-by-filesystem basis (e.g.,
>> on for filesystems on the SSD and off for the rest).
>
> We already have a flag for disks: DISKFLAG_CANDELETE, which tells if the
> disk support TRIM or not. Next we should add BIO_GETATTR attribute for
> DISK class to return true if TRIM is supported. This way UFS can ask if
> TRIM is supported on mount and don't bother sending BIO_DELETE if it is
> not supported.

Clearly asking the underlying device is the most automated way to decide
whether to do TRIM. Once it is possible to ask the underlying disk if it
supports TRIM, your change could be modified to do the querry to the disk
and set the FS_TRIM flag accordingly each time that the filesystem is
mounted. If it is easy to add a BIO_GETATTR attribute for the DISK class,
then we can skip the intermediate stage of using tunefs to enable/disable it.

        Kirk McKusick
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Konstantin Belousov
On Wed, Dec 08, 2010 at 03:56:59PM -0800, Kirk McKusick wrote:

> > Date: Thu, 9 Dec 2010 00:53:52 +0200
> > From: Kostik Belousov <[hidden email]>
> > To: Kirk McKusick <[hidden email]>
> > Cc: Julian Elischer <[hidden email]>, [hidden email],
> >         [hidden email], Oliver Fromme <[hidden email]>
> > Subject: Re: TRIM support for UFS?
> >
> > On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:
> > >
> > > It is safe to do the BIO_DELETE here because at this point we have
> > > progressed through the file delete to the point where the inode that
> > > claimed this block has been updated on the disk with a NULL pointer.
> >
> > Thanks for the explanation. On the other hand, can my scenario be real
> > for async mounts ?
>
> Async mounts are not compatible with SU or SUJ. You are running the
> filesystem in a mode where there is no guarantee of recovery after
> a crash. So yes, you may TRIM blocks that are still claimed by on-disk
> inodes. But that is likely the least of your worries. Since async makes
> no claims about filesystem consistency after a crash, adding one more
> way for it to break does not seem like a big deal :-)
>
> > > The above patch looks good though I would do it unconditionally
> > > (e.g., also for snapshots). It seems sensible to make it conditional
> > > on a sysctl variable so that folks could experiment with it more
> > > easily. And I would leave it off by default as non-SSD disks are
> > > unlikely to benefit from it. If it does prove useful for SSD disks
> > > then I would make it conditional on a filesystem flag so that it
> > > is possible to enable it on a filesystem-by-filesystem basis (e.g.,
> > > on for filesystems on the SSD and off for the rest).
> >
> > I think it is better to have a flag in superblock instead of the mount
> > option.
> >
> > Most of diff deleted...
> >
> > diff --git a/sys/ufs/ffs/fs.h b/sys/ufs/ffs/fs.h
> > index ba98ed3..13d9ede 100644
> > --- a/sys/ufs/ffs/fs.h
> > +++ b/sys/ufs/ffs/fs.h
> > @@ -417,6 +417,7 @@ CTASSERT(sizeof(struct fs) == 1376);
> >  #define FS_FLAGS_UPDATED 0x0080 /* flags have been moved to new location */
> >  #define FS_NFS4ACLS 0x0100 /* file system has NFSv4 ACLs enabled */
> >  #define FS_INDEXDIRS 0x0200 /* kernel supports indexed directories */
> > +#define FS_TRIM 0x0400 /* issue BIO_DELETE for deleted blocks */
> >
> >  /*
> >   * Macros to access bits in the fs_active array.
>
> I approve of this change and am happy to see it (finally) get added.
>
> > On Wed, 8 Dec 2010 at 23:33:27 +0100, Pawel Jakub Dawidek wrote:
> >> On Wed, Dec 08, 2010 at 02:23:52PM -0800, Kirk McKusick wrote:
> >> The above patch looks good though I would do it unconditionally
> >> (e.g., also for snapshots). It seems sensible to make it conditional
> >> on a sysctl variable so that folks could experiment with it more
> >> easily. And I would leave it off by default as non-SSD disks are
> >> unlikely to benefit from it. If it does prove useful for SSD disks
> >> then I would make it conditional on a filesystem flag so that it
> >> is possible to enable it on a filesystem-by-filesystem basis (e.g.,
> >> on for filesystems on the SSD and off for the rest).
> >
> > We already have a flag for disks: DISKFLAG_CANDELETE, which tells if the
> > disk support TRIM or not. Next we should add BIO_GETATTR attribute for
> > DISK class to return true if TRIM is supported. This way UFS can ask if
> > TRIM is supported on mount and don't bother sending BIO_DELETE if it is
> > not supported.
>
> Clearly asking the underlying device is the most automated way to decide
> whether to do TRIM. Once it is possible to ask the underlying disk if it
> supports TRIM, your change could be modified to do the querry to the disk
> and set the FS_TRIM flag accordingly each time that the filesystem is
> mounted. If it is easy to add a BIO_GETATTR attribute for the DISK class,
> then we can skip the intermediate stage of using tunefs to enable/disable it.
I still think that TRIM should be opt-in, at least for the first time.
Below is the BIO_GETATTR change integrated.

Now, both FS_TRIM should be set and device shall report the
GEOM::candelete attribute as true to have BIO_DELETE issued. geom_disk
and md(4) report GEOM::candelete. Also, I updated tunefs(8) manpage,
missed in the previous version.

diff --git a/sbin/dumpfs/dumpfs.c b/sbin/dumpfs/dumpfs.c
index 38c05f6..fa562dc 100644
--- a/sbin/dumpfs/dumpfs.c
+++ b/sbin/dumpfs/dumpfs.c
@@ -253,9 +253,11 @@ dumpfs(const char *name)
  printf("fs_flags expanded ");
  if (fsflags & FS_NFS4ACLS)
  printf("nfsv4acls ");
+ if (fsflags & FS_TRIM)
+ printf("trim ");
  fsflags &= ~(FS_UNCLEAN | FS_DOSOFTDEP | FS_NEEDSFSCK | FS_INDEXDIRS |
      FS_ACLS | FS_MULTILABEL | FS_GJOURNAL | FS_FLAGS_UPDATED |
-     FS_NFS4ACLS | FS_SUJ);
+     FS_NFS4ACLS | FS_SUJ | FS_TRIM);
  if (fsflags != 0)
  printf("unknown flags (%#x)", fsflags);
  putchar('\n');
diff --git a/sbin/tunefs/tunefs.8 b/sbin/tunefs/tunefs.8
index a883cd4..c62af94 100644
--- a/sbin/tunefs/tunefs.8
+++ b/sbin/tunefs/tunefs.8
@@ -28,7 +28,7 @@
 .\"     @(#)tunefs.8 8.2 (Berkeley) 12/11/93
 .\" $FreeBSD$
 .\"
-.Dd March 6, 2010
+.Dd December 9, 2010
 .Dt TUNEFS 8
 .Os
 .Sh NAME
@@ -51,6 +51,7 @@
 .Op Fl p
 .Op Fl s Ar avgfpdir
 .Op Fl S Ar size
+.Op Fl t Cm enable | disable
 .Ar special | filesystem
 .Sh DESCRIPTION
 The
@@ -143,6 +144,10 @@ Specify the expected number of files per directory.
 .It Fl S Ar size
 Specify the softdep journal size in bytes.
 The minimum is 4M.
+.It Fl t Cm enable | disable
+Turn on/off the TRIM enable flag.
+If enabled, and if the underlaying device supports BIO_DELETE
+command, kernel will delete the freed blocks.
 .El
 .Pp
 At least one of the above flags is required.
diff --git a/sbin/tunefs/tunefs.c b/sbin/tunefs/tunefs.c
index b2ea602..0ed713d 100644
--- a/sbin/tunefs/tunefs.c
+++ b/sbin/tunefs/tunefs.c
@@ -82,11 +82,13 @@ int
 main(int argc, char *argv[])
 {
  char *avalue, *jvalue, *Jvalue, *Lvalue, *lvalue, *Nvalue, *nvalue;
+ char *tvalue;
  const char *special, *on;
  const char *name;
  int active;
  int Aflag, aflag, eflag, evalue, fflag, fvalue, jflag, Jflag, Lflag;
  int lflag, mflag, mvalue, Nflag, nflag, oflag, ovalue, pflag, sflag;
+ int tflag;
  int svalue, Sflag, Svalue;
  int ch, found_arg, i;
  const char *chg[2];
@@ -101,7 +103,8 @@ main(int argc, char *argv[])
  evalue = fvalue = mvalue = ovalue = svalue = Svalue = 0;
  active = 0;
  found_arg = 0; /* At least one arg is required. */
- while ((ch = getopt(argc, argv, "Aa:e:f:j:J:L:l:m:N:n:o:ps:S:")) != -1)
+ while ((ch = getopt(argc, argv, "Aa:e:f:j:J:L:l:m:N:n:o:ps:S:t:"))
+    != -1)
  switch (ch) {
 
  case 'A':
@@ -268,6 +271,18 @@ main(int argc, char *argv[])
  Sflag = 1;
  break;
 
+ case 't':
+ found_arg = 1;
+ name = "trim";
+ tvalue = optarg;
+ if (strcmp(tvalue, "enable") != 0 &&
+    strcmp(tvalue, "disable") != 0) {
+ errx(10, "bad %s (options are %s)",
+    name, "`enable' or `disable'");
+ }
+ tflag = 1;
+ break;
+
  default:
  usage();
  }
@@ -493,6 +508,24 @@ main(int argc, char *argv[])
  sblock.fs_avgfpdir = svalue;
  }
  }
+ if (tflag) {
+ name = "issue TRIM to the disk";
+ if (strcmp(tvalue, "enable") == 0) {
+ if (sblock.fs_flags & FS_TRIM)
+ warnx("%s remains unchanged as enabled", name);
+ else {
+ sblock.fs_flags |= FS_TRIM;
+ warnx("%s set", name);
+ }
+ } else if (strcmp(tvalue, "disable") == 0) {
+ if ((~sblock.fs_flags & FS_TRIM) == FS_TRIM)
+ warnx("%s remains unchanged as disabled", name);
+ else {
+ sblock.fs_flags &= ~FS_TRIM;
+ warnx("%s cleared", name);
+ }
+ }
+ }
 
  if (sbwrite(&disk, Aflag) == -1)
  goto err;
@@ -1011,12 +1044,13 @@ out:
 void
 usage(void)
 {
- fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n",
+ fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n%s\n",
 "usage: tunefs [-A] [-a enable | disable] [-e maxbpg] [-f avgfilesize]",
 "              [-J enable | disable] [-j enable | disable]",
 "              [-L volname] [-l enable | disable] [-m minfree]",
 "              [-N enable | disable] [-n enable | disable]",
-"              [-o space | time] [-p] [-s avgfpdir] special | filesystem");
+"              [-o space | time] [-p] [-s avgfpdir] [-t enable | disable]",
+"              special | filesystem");
  exit(2);
 }
 
@@ -1035,6 +1069,8 @@ printfs(void)
  (sblock.fs_flags & FS_SUJ)? "enabled" : "disabled");
  warnx("gjournal: (-J)                                     %s",
  (sblock.fs_flags & FS_GJOURNAL)? "enabled" : "disabled");
+ warnx("trim: (-p)                                         %s",
+ (sblock.fs_flags & FS_TRIM)? "enabled" : "disabled");
  warnx("maximum blocks per file in a cylinder group: (-e)  %d",
       sblock.fs_maxbpg);
  warnx("average file size: (-f)                            %d",
diff --git a/sys/dev/md/md.c b/sys/dev/md/md.c
index 6c3484e..c294d1b 100644
--- a/sys/dev/md/md.c
+++ b/sys/dev/md/md.c
@@ -713,11 +713,12 @@ md_kthread(void *arg)
  }
  mtx_unlock(&sc->queue_mtx);
  if (bp->bio_cmd == BIO_GETATTR) {
- if (sc->fwsectors && sc->fwheads &&
+ if ((sc->fwsectors && sc->fwheads &&
     (g_handleattr_int(bp, "GEOM::fwsectors",
     sc->fwsectors) ||
     g_handleattr_int(bp, "GEOM::fwheads",
-    sc->fwheads)))
+    sc->fwheads))) ||
+    g_handleattr_int(bp, "GEOM::candelete", 1))
  error = -1;
  else
  error = EOPNOTSUPP;
diff --git a/sys/geom/geom_disk.c b/sys/geom/geom_disk.c
index 25d2e3b..eda702c 100644
--- a/sys/geom/geom_disk.c
+++ b/sys/geom/geom_disk.c
@@ -297,6 +297,9 @@ g_disk_start(struct bio *bp)
  } while (bp2 != NULL);
  break;
  case BIO_GETATTR:
+ if (g_handleattr_int(bp, "GEOM::candelete",
+    (dp->d_flags & DISKFLAG_CANDELETE) != 0))
+ break;
  if (g_handleattr_int(bp, "GEOM::fwsectors", dp->d_fwsectors))
  break;
  else if (g_handleattr_int(bp, "GEOM::fwheads", dp->d_fwheads))
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index b740bbb..926229c 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -83,6 +83,8 @@ __FBSDID("$FreeBSD$");
 
 #include <security/audit/audit.h>
 
+#include <geom/geom.h>
+
 #include <ufs/ufs/dir.h>
 #include <ufs/ufs/extattr.h>
 #include <ufs/ufs/quota.h>
@@ -1850,6 +1852,7 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  u_int cg;
  u_int8_t *blksfree;
  struct cdev *dev;
+ struct bio *bip;
 
  cg = dtog(fs, bno);
  if (devvp->v_type == VREG) {
@@ -1962,6 +1965,16 @@ ffs_blkfree(ump, fs, devvp, bno, size, inum, dephd)
  softdep_setup_blkfree(UFSTOVFS(ump), bp, bno,
     numfrags(fs, size), dephd);
  bdwrite(bp);
+
+ if (ump->um_candelete) {
+ bip = g_alloc_bio();
+ bip->bio_cmd = BIO_DELETE;
+ bip->bio_offset = dbtob(fsbtodb(fs, bno));
+ bip->bio_done = g_destroy_bio;
+ bip->bio_length = size;
+ g_io_request(bip,
+    (struct g_consumer *)devvp->v_bufobj.bo_private);
+ }
 }
 
 #ifdef INVARIANTS
diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
index 72f40da..f578382 100644
--- a/sys/ufs/ffs/ffs_vfsops.c
+++ b/sys/ufs/ffs/ffs_vfsops.c
@@ -895,6 +895,21 @@ ffs_mountfs(devvp, mp, td)
     mp->mnt_stat.f_mntonname);
 #endif
  }
+ if ((fs->fs_flags & FS_TRIM) != 0) {
+ size = sizeof(int);
+ if (g_io_getattr("GEOM::candelete", cp, &size,
+    &ump->um_candelete) == 0) {
+ if (!ump->um_candelete)
+ printf(
+"WARNING: %s: TRIM flag on fs but disk does not support TRIM\n",
+    mp->mnt_stat.f_mntonname);
+ } else {
+ printf(
+"WARNING: %s: TRIM flag on fs but cannot get whether disk supports TRIM\n",
+    mp->mnt_stat.f_mntonname);
+ ump->um_candelete = 0;
+ }
+ }
 
  ump->um_mountp = mp;
  ump->um_dev = dev;
diff --git a/sys/ufs/ffs/fs.h b/sys/ufs/ffs/fs.h
index ba98ed3..13d9ede 100644
--- a/sys/ufs/ffs/fs.h
+++ b/sys/ufs/ffs/fs.h
@@ -417,6 +417,7 @@ CTASSERT(sizeof(struct fs) == 1376);
 #define FS_FLAGS_UPDATED 0x0080 /* flags have been moved to new location */
 #define FS_NFS4ACLS 0x0100 /* file system has NFSv4 ACLs enabled */
 #define FS_INDEXDIRS 0x0200 /* kernel supports indexed directories */
+#define FS_TRIM 0x0400 /* issue BIO_DELETE for deleted blocks */
 
 /*
  * Macros to access bits in the fs_active array.
diff --git a/sys/ufs/ufs/ufsmount.h b/sys/ufs/ufs/ufsmount.h
index 03edd73..b13db40 100644
--- a/sys/ufs/ufs/ufsmount.h
+++ b/sys/ufs/ufs/ufsmount.h
@@ -95,6 +95,7 @@ struct ufsmount {
  time_t um_itime[MAXQUOTAS]; /* inode quota time limit */
  char um_qflags[MAXQUOTAS]; /* quota specific flags */
  int64_t um_savedmaxfilesize; /* XXX - limit maxfilesize */
+ int um_candelete; /* devvp supports TRIM */
  int (*um_balloc)(struct vnode *, off_t, int, struct ucred *, int, struct buf **);
  int (*um_blkatoff)(struct vnode *, off_t, char **, struct buf **);
  int (*um_truncate)(struct vnode *, off_t, int, struct ucred *, struct thread *);

attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Pawel Jakub Dawidek
On Thu, Dec 09, 2010 at 12:34:11PM +0200, Kostik Belousov wrote:
> I still think that TRIM should be opt-in, at least for the first time.
> Below is the BIO_GETATTR change integrated.
>
> Now, both FS_TRIM should be set and device shall report the
> GEOM::candelete attribute as true to have BIO_DELETE issued. geom_disk
> and md(4) report GEOM::candelete. Also, I updated tunefs(8) manpage,
> missed in the previous version.
[...]
> --- a/sbin/tunefs/tunefs.c
> +++ b/sbin/tunefs/tunefs.c
[...]
> @@ -1035,6 +1069,8 @@ printfs(void)
>   (sblock.fs_flags & FS_SUJ)? "enabled" : "disabled");
>   warnx("gjournal: (-J)                                     %s",
>   (sblock.fs_flags & FS_GJOURNAL)? "enabled" : "disabled");
> + warnx("trim: (-p)                                         %s",

s/-p/-t/

Other than that looks nice and clean.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[hidden email]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

attachment0 (203 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Kirk McKusick-2
In reply to this post by Konstantin Belousov
Other than the nit pointed out by Pawel, the diffs look good to me.
You should consider adding the -t option to newfs so that the TRIM
option can be specified at the time the filesystem is created (as
a general rule, anything you can do with tunefs should also be
possible with newfs).

I agree with your decision to let administrators opt-out of doing
TRIM. If experience shows it to be generally useful to have it on,
we can change the default to enabled later. If we do change the
default to enabled, then we will want to delete the warning about TRIM
not being supported by the underlying disk that you added at mount
time as we would start getting a lot of them for all the non-SSD disks.

        Kirk McKusick
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Julian Elischer-5
On 12/9/10 10:13 AM, Kirk McKusick wrote:

> Other than the nit pointed out by Pawel, the diffs look good to me.
> You should consider adding the -t option to newfs so that the TRIM
> option can be specified at the time the filesystem is created (as
> a general rule, anything you can do with tunefs should also be
> possible with newfs).
>
> I agree with your decision to let administrators opt-out of doing
> TRIM. If experience shows it to be generally useful to have it on,
> we can change the default to enabled later. If we do change the
> default to enabled, then we will want to delete the warning about TRIM
> not being supported by the underlying disk that you added at mount
> time as we would start getting a lot of them for all the non-SSD disks.
>
> Kirk McKusick
>
I echo the above.
I am very happy to see this happen, and if it goes into 8.x I'll try see
that we (fusion-io) support it in our FreeBSD driver as soon as we can.

Several people have asked why trim is important for flash based devices..

The answer is that 'trimmed' packets get added back to the 'free pool
of pages available for a flash device to use and that helps in two ways.
firstly the average free space recoverable when you wring out a used
block of sectors increases, meaning that you have to garbage collect
fewer of them, which means that you have less overhead eating into
your performance, and secondly, when you start off after a period of
low utilisation, you have more headroom to write into before you need
to start garbage collection.  it's a win-win..



_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Julian Elischer-5
In reply to this post by Konstantin Belousov
On 12/9/10 2:34 AM, Kostik Belousov wrote:

[patch]


commit when ready.. I'm looking forwards to being able to try it out.
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Bakul Shah
In reply to this post by Kirk McKusick-2
On Thu, 09 Dec 2010 10:13:39 PST Kirk McKusick <[hidden email]>  wrote:

> Other than the nit pointed out by Pawel, the diffs look good to me.
> You should consider adding the -t option to newfs so that the TRIM
> option can be specified at the time the filesystem is created (as
> a general rule, anything you can do with tunefs should also be
> possible with newfs).
>
> I agree with your decision to let administrators opt-out of doing
> TRIM. If experience shows it to be generally useful to have it on,
> we can change the default to enabled later. If we do change the
> default to enabled, then we will want to delete the warning about TRIM
> not being supported by the underlying disk that you added at mount
> time as we would start getting a lot of them for all the non-SSD disks.

Would be nice if something like ftrim(fd, offset, size) or
trim(path, offsetm size) or TRIM file ioctl is added, to free
up blocks undelying a given range in a file.  ftruncate can
delete blocks at the end but there is no facility to lose
blocks in the middle.  Mainly handy for virtual disks and
databases (and would work nicely with SEEK_DATA, SEEK_HOLE).


_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Jeremy Chadwick
In reply to this post by Julian Elischer-5
On Thu, Dec 09, 2010 at 04:23:56PM -0800, Julian Elischer wrote:
> On 12/9/10 2:34 AM, Kostik Belousov wrote:
>
> [patch]
>
> commit when ready.. I'm looking forwards to being able to try it out.

And how does one actually verify TRIM is working?  Most SSDs I've used
do not have a SMART attribute that can help determine this.  I would
rather not make a bunch of assumptions about Attributes 226, 232, or 233
to help give some indication of it working.

For folks to test this, there needs to be some indicator/verification
mechanism that it's working.

--
| Jeremy Chadwick                                   [hidden email] |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Kirk McKusick-2
In reply to this post by Bakul Shah
> To: Kirk McKusick <[hidden email]>
> cc: Kostik Belousov <[hidden email]>, [hidden email],
>         [hidden email], Oliver Fromme <[hidden email]>
> Subject: Re: TRIM support for UFS?
> Date: Thu, 09 Dec 2010 16:37:48 -0800
> From: Bakul Shah <[hidden email]>
> X-ASK-Info: Message Queued (2010/12/09 16:37:54)
> X-ASK-Info: Confirmed by User (2010/12/09 16:39:17)
>
> Would be nice if something like ftrim(fd, offset, size) or
> trim(path, offsetm size) or TRIM file ioctl is added, to free
> up blocks undelying a given range in a file.  ftruncate can
> delete blocks at the end but there is no facility to lose
> blocks in the middle.  Mainly handy for virtual disks and
> databases (and would work nicely with SEEK_DATA, SEEK_HOLE).

We have discussed adding such a system call over the years (actually
we called it `release' rather than `trim'). If such a call ever does
get added, any blocks that it actually manages to release will get
passed through to BIO_DELETE by the changes that Kostik is soon to
add to the system.

        Kirk McKusick
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: TRIM support for UFS?

Pawel Jakub Dawidek
In reply to this post by Bakul Shah
On Thu, Dec 09, 2010 at 04:37:48PM -0800, Bakul Shah wrote:

> On Thu, 09 Dec 2010 10:13:39 PST Kirk McKusick <[hidden email]>  wrote:
> > Other than the nit pointed out by Pawel, the diffs look good to me.
> > You should consider adding the -t option to newfs so that the TRIM
> > option can be specified at the time the filesystem is created (as
> > a general rule, anything you can do with tunefs should also be
> > possible with newfs).
> >
> > I agree with your decision to let administrators opt-out of doing
> > TRIM. If experience shows it to be generally useful to have it on,
> > we can change the default to enabled later. If we do change the
> > default to enabled, then we will want to delete the warning about TRIM
> > not being supported by the underlying disk that you added at mount
> > time as we would start getting a lot of them for all the non-SSD disks.
>
> Would be nice if something like ftrim(fd, offset, size) or
> trim(path, offsetm size) or TRIM file ioctl is added, to free
> up blocks undelying a given range in a file.  ftruncate can
> delete blocks at the end but there is no facility to lose
> blocks in the middle.  Mainly handy for virtual disks and
> databases (and would work nicely with SEEK_DATA, SEEK_HOLE).
libgeom(3) provides:

        int g_delete(int fd, off_t offset, off_t length);

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[hidden email]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

attachment0 (203 bytes) Download Attachment
12
Loading...