OpenAFS on FreeBSD 8.1

Jan Henrik Sylvester
I did not expect my problems to have vanished, but I wanted to try again.

Should I use the git based port
http://stuff.mit.edu/afs/sipb.mit.edu/user/kaduk/freebsd/openafs/openafs-devel.shar.txt 
you pointed me to earlier for testing? Or should I always use
http://web.mit.edu/freebsd/openafs/openafs.shar that you posted to the
Quarterly Status Report?

With both, I run into the same problem compiling on FreeBSD 8.1.
http://svn.freebsd.org/viewvc/base?view=revision&revision=209524 changed
the definition of ifa_ifwithnet. In rx/rx_kernel.h, FreeBSD 8.1 needs
the same definition of rx_ifaddr_withnet as AFS_OBSD46_ENV (while
FreeBSD 8.0 needs the generic one). Should FreeBSD 8.0 still be supported?

With the git based port, I get an error on "kldload libafs": "can't load
libafs: Exec format error" (missing symbol?) -- openafs-1.5.75 (the
other port) does not seem to have this problem.

After starting afsd, I realized that I had not updated my CellServDB and
thus tried to shut down afsd, which complained about /afs still being
mounted. Trying to umount /afs, I got a segfault in the kernel. (I had
not actually accessed /afs before doing that.) I guess restarting afsd
is not possible for now. (No big deal.)

This time I could list a few directories without the long blocking
periods I saw in my last round of testing. Good. Copying a huge file
from AFS was terribly slow (even for my DSL connection), but it
progressed steadily and I was able to abort it without deadlocking or
crashing. Copying a 16MB file to AFS blocked a parallel "ls -l" on the
directory I was copying to, but it eventually finished. The file was
not corrupted. Great.

The main differences, besides now being on FreeBSD 8.1 and using a newer
version of the OpenAFS port, are that this time I was testing over a
slow DSL connection (via WLAN) rather than the LAN connection at my
university, and that I was testing against the AFS cell of a different
department. I will try to repeat the tests under the same conditions as
last time (aside from the software versions) later.

pagsh does not immediately crash anymore -- another improvement, even if
it is minor compared to FreeBSD no longer crashing when using AFS.

BTW: Thanks for all your work!

Cheers,
Jan Henrik
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-afs
To unsubscribe, send any mail to "[hidden email]"

Re: OpenAFS on FreeBSD 8.1

Jan Henrik Sylvester
On 07/23/2010 12:30, Jan Henrik Sylvester wrote:
> This time I could list a few directories without the long blocking
> periods I saw in my last round of testing. Good. Copying a huge file
> from AFS was terribly slow (even for my DSL connection), but it
> progressed steadily and I was able to abort it without deadlocking or
> crashing. Copying a 16MB file to AFS blocked a parallel "ls -l" on the
> directory I was copying to, but it eventually finished. The file was
> not corrupted. Great.

I did more testing from the university against both of the AFS cells I
had been testing before. Copying a few MB from AFS and copying a 16MB
file to AFS were both fine (showing 6MB/s while copying).

Trying to copy a 512MB file to AFS locked up all of AFS after two
seconds, during which it showed copy rates of 40MB/s (although the
network is only 100Mbit/s). After increasing the AFS cache size to
512MB, almost all of the file got copied before AFS locked up. With a
cache of 1GB, the file was copied without a deadlock or corruption. (All
of this was on a multiprocessor (MP) system; I have not tried disabling
all but one core.)
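
As a rough check of the figures above: a 100Mbit/s link can carry at
most about 12.5MB/s, so a displayed 40MB/s presumably reflects writes
into the local cache rather than data on the wire. That reading is an
assumption, not something measured here, and the function name below is
made up for illustration.

```shell
#!/bin/sh
# Compare an observed copy rate with the ceiling of a network link.
# The figures are the ones quoted in the message; 1 MB/s is taken as 8 Mbit/s.
rate_vs_link() {
    link_mbit=$1      # link speed in Mbit/s
    observed_mb=$2    # copy rate shown while copying, in MB/s
    awk -v l="$link_mbit" -v o="$observed_mb" 'BEGIN {
        ceiling = l / 8
        printf "link ceiling %.1f MB/s, observed %.1f MB/s (%.1fx)\n", ceiling, o, o / ceiling
    }'
}

rate_vs_link 100 40
```

The observed rate is roughly three times what the link could sustain,
which would fit the lockup appearing once the cache fills.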

Rebooting the machine after having done nothing but the successful copy
of the 512MB file, I got:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 05
fault virtual address   = 0x290
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805959ae
stack pointer           = 0x28:0xffffff807500c6c0
frame pointer           = 0x28:0xffffff807500c6e0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1944 (afsd)
trap number             = 12
panic: page fault
cpuid = 3

Overall, the only problems I ran into during my tests were copying files
larger than the cache size and shutting down afsd. So far, AFS seems to
be becoming usable for me (even on MP).

Thanks again,
Jan Henrik

Re: OpenAFS on FreeBSD 8.1

Benjamin Kaduk-2
Hi Jan,

Sorry for the long delay in responding -- mail piled up a bit during a
busy week.

On Fri, 23 Jul 2010, Jan Henrik Sylvester wrote:

> On 07/23/2010 12:30, Jan Henrik Sylvester wrote:
>> This time I could list a few directories without the long blocking
>> periods I saw in my last round of testing. Good. Copying a huge file
>> from AFS was terribly slow (even for my DSL connection), but it
>> progressed steadily and I was able to abort it without deadlocking or
>> crashing. Copying a 16MB file to AFS blocked a parallel "ls -l" on the
>> directory I was copying to,

I'm pretty sure that we're holding an exclusive vnode lock when we're not
supposed to, but I haven't looked into why the lock diagnostics don't
complain about it.

>> but it eventually finished. The file was not corrupted. Great.
>
> I did more testing from the university against both of the AFS cells I
> had been testing before. Copying a few MB from AFS and copying a 16MB
> file to AFS were both fine (showing 6MB/s while copying).
>
> Trying to copy a 512MB file to AFS locked up all of AFS after two
> seconds, during which it showed copy rates of 40MB/s (although the
> network is only 100Mbit/s). After increasing the AFS cache size to
> 512MB, almost all of the file got copied before AFS locked up. With a
> cache of 1GB, the file was copied without a deadlock or corruption. (All
> of this was on a multiprocessor (MP) system; I have not tried disabling
> all but one core.)

Do you remember if this was with the git-based port or the 1.5.75 linked
from the status report?  The latter has an extra patch which band-aids
around a reference-counting bug when we need to reclaim used vnodes due to
a space crunch.

>
> Rebooting the machine after having done nothing but the successful copy of
> the 512MB file, I got:
> Fatal trap 12: page fault while in kernel mode

Hm, hard to do much about that without a backtrace.  I've seen occasional
errors when shutting down afsd (various manifestations), but I'd say it
completes successfully at least half the time (umount -f, that is).

>
> Overall, the only problems I ran into during my tests were copying files
> larger than the cache size and shutting down afsd. So far, AFS seems to
> be becoming usable for me (even on MP).

Glad to hear things are getting better.



On Fri, 23 Jul 2010, Jan Henrik Sylvester wrote:

>
> I did not expect my problems to have vanished, but I wanted to try again.
>
> Should I use the git based port
> http://stuff.mit.edu/afs/sipb.mit.edu/user/kaduk/freebsd/openafs/openafs-devel.shar.txt 
> you pointed me to earlier for testing? Or should I always use
> http://web.mit.edu/freebsd/openafs/openafs.shar that you posted to the
> Quarterly Status Report?

I would probably stick to the git-based port, as that will give more
useful reports when things break (such as the one you mention below).  As
I mentioned above, there is one patch in the latter shar which is not in
git; it's http://gerrit.openafs.org/2321 .  You can add it to the
git-based port by stopping after the 'make patch' stage, going into the
work directory and running:
git pull git://git.openafs.org/openafs refs/changes/21/2321/1
and then proceeding with the configure, build, and install stages.
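
The sequence just described can be sketched as a small script. It only
prints the commands unless RUN=1 is set. PORTDIR and WRKSRC are
hypothetical paths (they depend on where the shar was unpacked and how
the port names its work directory), and the step helper is made up for
this illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints each command; set RUN=1 to actually execute them.
# PORTDIR and WRKSRC are hypothetical; adjust them to your own setup.
PORTDIR="${PORTDIR:-/usr/ports/net/openafs-devel}"
WRKSRC="${WRKSRC:-$PORTDIR/work/openafs}"
RUN="${RUN:-0}"

step() {
    if [ "$RUN" = "1" ]; then
        "$@" || exit 1
    else
        echo "+ $*"
    fi
}

add_gerrit_2321() {
    step cd "$PORTDIR"
    step make patch        # stop after the patch stage
    step cd "$WRKSRC"      # the port's work directory
    step git pull git://git.openafs.org/openafs refs/changes/21/2321/1
    step cd "$PORTDIR"
    step make configure
    step make build
    step make install
}

add_gerrit_2321
```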

>
> With both, I run into the same problem compiling on FreeBSD 8.1.
> http://svn.freebsd.org/viewvc/base?view=revision&revision=209524 changed
> the definition of ifa_ifwithnet. In rx/rx_kernel.h, FreeBSD 8.1 needs
> the same definition of rx_ifaddr_withnet as AFS_OBSD46_ENV (while
> FreeBSD 8.0 needs the generic one). Should FreeBSD 8.0 still be supported?
>

I'll try to get that fix in this weekend (if not sooner).  I only have
9-current test boxes, and I think Derrick only has 8.0, so 8.1-specific
things would otherwise rely on me noticing relevant changes in the commit
emails that go by; this doesn't work very well when I don't have much time
to read them :)

> With the git based port, I get an error on "kldload libafs": "can't load
> libafs: Exec format error" (missing symbol?) -- openafs-1.5.75 (the
> other port) does not seem to have this problem.
>

Sounds like someone introduced a regression since then; thanks for the
report.

> After starting afsd, I realized that I had not updated my CellServDB and
> thus tried to shut down afsd, which complained about /afs still being
> mounted. Trying to umount /afs, I got a segfault in the kernel. (I had
> not actually accessed /afs before doing that.) I guess restarting afsd
> is not possible for now. (No big deal.)
>

It ... should be possible, though it is not fully reliable.  Be sure to
unload and reload the kernel module between unmounting /afs and restarting
afsd, though.
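
As a sketch, that order of operations looks like the following. It only
prints the commands unless RUN=1 is set (and would then need root). The
module name is the one used with kldload earlier in this thread; the
afsd options are site-specific and deliberately left out, and the helper
names are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints each command; set RUN=1 to execute them (as root).
RUN="${RUN:-0}"
MODULE="${MODULE:-libafs}"   # module name as used with kldload in this thread

step() {
    if [ "$RUN" = "1" ]; then
        "$@" || exit 1
    else
        echo "+ $*"
    fi
}

restart_afs_client() {
    step umount -f /afs        # forced unmount, as mentioned above
    step kldunload "$MODULE"   # unload the kernel module ...
    step kldload "$MODULE"     # ... and load a fresh copy
    step afsd "$@"             # pass your usual afsd options here
}

restart_afs_client
```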


-Ben Kaduk


>
> pagsh does not immediately crash anymore -- another improvement, even if
> it is minor compared to FreeBSD no longer crashing when using AFS.
>
> BTW: Thanks for all your work!
>
> Cheers,
> Jan Henrik

Re: OpenAFS on FreeBSD 8.1

Benjamin Kaduk-2
On Wed, 28 Jul 2010, Benjamin Kaduk wrote:

> Hi Jan,
>
> Sorry for the long delay in responding -- mail piled up a bit during a busy
> week.
>
>
> On Fri, 23 Jul 2010, Jan Henrik Sylvester wrote:
>
>>
>>
>> With both, I run into the same problem compiling on FreeBSD 8.1.
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=209524 changed
>> the definition of ifa_ifwithnet. In rx/rx_kernel.h, FreeBSD 8.1 needs the
>> same definition of rx_ifaddr_withnet as AFS_OBSD46_ENV (while FreeBSD 8.0
>> needs the generic one). Should FreeBSD 8.0 still be supported?
>>
>
> I'll try to get that fix in this weekend (if not sooner).  I only have
> 9-current test boxes, and I think Derrick only has 8.0, so 8.1-specific
> things would otherwise rely on me noticing relevant changes in the commit
> emails that go by; this doesn't work very well when I don't have much time to
> read them :)

That fix is in the tree -- thanks!

>
>> With the git based port, I get an error on "kldload libafs": "can't load
>> libafs: Exec format error" (missing symbol?) -- openafs-1.5.75 (the other
>> port) does not seem to have this problem.
>>
>
> Sounds like someone introduced a regression since then; thanks for the
> report.
>

This one proves to be quite a bit more difficult; if you look at the
console when you try to load the module (or in dmesg), it complains about
a particular undefined symbol, afs_FlushVS.  This function is supposed to
be called when we are short on cache space (and other things) and need to
reclaim space.  However ... it isn't implemented anywhere.  This codepath
changed with commit d29643b0553011cbe60dd127fd31c1847fe02ddb, which
enabled disconnected mode always for unix clients -- the old version (for
non-disconnected mode) used a different function.  It will probably be a
while before we rewrite things properly (as it is not exactly clear what
"properly" means, at least to me).
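
To check for this on your own machine, a filter along these lines can
pull the symbol names out of the console messages. It is only a sketch:
the "symbol NAME undefined" wording it matches is an assumption about
the kernel linker's message format, and the function name is made up.

```shell
#!/bin/sh
# Extract the names of undefined symbols from kernel linker messages on stdin.
# The "symbol NAME undefined" wording is an assumption about the message format.
undefined_symbols() {
    sed -n 's/.*symbol \([A-Za-z_][A-Za-z0-9_]*\) undefined.*/\1/p' | sort -u
}

# Usage (on the affected machine, after a failed "kldload libafs"):
#   dmesg | undefined_symbols
```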

-Ben