Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added test for file tree creation with large files #258

Closed

Conversation

mrbojangles3
Copy link

This is the PR alluded to in #256. I have added a test case and some supporting functions to trigger an issue I am seeing on my system. TestCreateFileTree is the case I am adding. This creates a 6GiB fat32 file system, so hopefully the CI doesn't fall over because of it.

This is a great project, and you have been very responsive. Thank you.

lblyth@host:~/repos/go-diskfs/filesystem/fat32$ go test                                                             
--- FAIL: TestCreateFileTree (8.11s)                                                                                 
panic: could not read directory entries for /b/sub50/blob [recovered]
        panic: could not read directory entries for /b/sub50/blob
                                                                                                                     
goroutine 263 [running]:                                                                                             
testing.tRunner.func1.2({0x6b7a80, 0xc00002e190})
        /home/lblyth/go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1632 +0x230
testing.tRunner.func1()
        /home/lblyth/go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1635 +0x35e
panic({0x6b7a80?, 0xc00002e190?})                                                                                    
        /home/lblyth/go/pkg/mod/golang.org/[email protected]/src/runtime/panic.go:785 +0x132
github.com/diskfs/go-diskfs/filesystem/fat32_test.mkGigFile({0x778b38?, 0xc0000f6630?}, {0x70d9bd?, 0x721bcd?})
        /home/lblyth/repos/go-diskfs/filesystem/fat32/fat32_test.go:1134 +0xa5
github.com/diskfs/go-diskfs/filesystem/fat32_test.TestCreateFileTree(0xc0000dc340?)
        /home/lblyth/repos/go-diskfs/filesystem/fat32/fat32_test.go:1162 +0x328
testing.tRunner(0xc0000dc340, 0x7227e0)
        /home/lblyth/go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1690 +0xf4
created by testing.(*T).Run in goroutine 1                                                                           
        /home/lblyth/go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1743 +0x390
exit status 2
FAIL    github.com/diskfs/go-diskfs/filesystem/fat32    8.574s  

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Heh, it didn't even get to your test, because listing failed. Please check that.

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Lint still is complaining. Run make lint locally?

@mrbojangles3
Copy link
Author

Well, the thing is, I have been. But I am getting lint errors in code I didn't touch (in the standard library) and the errors I am seeing here aren't the ones I have had show up in my local runs of make lint. I am on x86 running fedora.

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

What go version are you using?

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Just 3 lint errors left.

@mrbojangles3
Copy link
Author

go version go1.23.0 linux/amd64

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

So that should do it. Maybe you have a different version of golangci-lint. Either way, you can see the last 3 lint errors here

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Lint is all clean. Huzzah!

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Is the error what you expected?

@mrbojangles3
Copy link
Author

I am glad I got my local env matching the CI so this will hopefully NEVER happen again.

This is the "right" error.

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

"never" is such a strong word. 😆

Can you squash the commits to a single commit?

Do you understand the cause of the error that this test triggers?

@mrbojangles3
Copy link
Author

mrbojangles3 commented Sep 25, 2024

"never" is such a strong word. 😆

lol, yep

Squashed!

I do not understand what is causing this error. It was odd to see it pop up in our env. Based on our use case (making a boot-able image with two partitions) it only happened with the big files.

@deitch
Copy link
Collaborator

deitch commented Sep 25, 2024

Well, with an actual test case, we can try and solve it now.

@mrbojangles3
Copy link
Author

I am looking forward to learning more about FAT32 because of this.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Do you know why the tests take so long now? On master branch, it takes about 1s to run all of the fat32 tests. With this branch, about a minute.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Do you mind rebasing? I fixed some of the missing error reporting. Does not fix your issue, but does make the errors easier to see.

@mrbojangles3
Copy link
Author

I am not sure why the test time ballooned. I am making a 6GiB file so that might take time depending on whats happening on the local machine.

Good suggestion on the temp file. I assume there will be more changes so if its okay, I will hold off on the squashes until the end.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Better, thanks. Looking at it. If you can squash the commits into a single one, rather than having the merge-from-master commit, that would help.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Sure, we can wait for that.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

OK, I have some practical information. Up until you create the first gigfile, everything is fine. I checked the state of the file-tree.img file, and it looks good. For example, here are the starting bytes for cluster 2 (fat32 root):

00000000: 4e4f 204e 414d 4520 2020 2008 0000 8164  NO NAME    ....d
00000010: 3a59 3a59 0000 8164 3a59 0000 0000 0000  :Y:Y...d:Y......
00000020: 4120 2020 2020 2020 2020 2010 0000 8164  A          ....d
00000030: 3a59 3a59 0000 8164 3a59 0300 0000 0000  :Y:Y...d:Y......
00000040: 4162 0000 00ff ffff ffff ff0f 00c0 ffff  Ab..............
00000050: ffff ffff ffff ffff ffff 0000 ffff ffff  ................
00000060: 4220 2020 2020 2020 2020 2010 0000 8164  B          ....d
00000070: 3a59 3a59 0000 8164 3a59 0400 0000 0000  :Y:Y...d:Y......
00000080: 4172 006f 006f 0074 0066 000f 001a 6900  Ar.o.o.t.f....i.
00000090: 6c00 6500 0000 ffff ffff 0000 ffff ffff  l.e.............
000000a0: 524f 4f54 4649 4c45 2020 2000 0000 8164  ROOTFILE   ....d
000000b0: 3a59 3a59 0000 8164 3a59 0500 0b00 0000  :Y:Y...d:Y......
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

After the first gigfile:

00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Clearly, something there is wiping them clean. My first suspicion is that the calculations break down somewhere at a certain size. That should be something we can test.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Next data point. If I skip your whole tree, and just make the gigfile, it works fine. So it is some combination of the two.

@mrbojangles3
Copy link
Author

Can I ask a methodology question? - Are you using delve, breakpoints, then hexdumping the image?

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Exactly that. Although hexdumping in a separate window so I can work in parallel.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

It is an overflow. We use uint32 for the read offset here, and for the write offset here, but it can be up to 64 bits (filesystem offset calls on go usually take int64). Either way, I managed to recreate it where it gets to a position where it overflows, then starts overwriting back from the beginning.

PR coming soon.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

See #260 .

And I incorporated your tests in the. I rewrote them a bit to fit with the style, but they go right on in.

@mrbojangles3
Copy link
Author

Great find!
I admit, I wouldn't have suspected that overflow. Since fat32 came about in a time when there wasn't uint64. I figured all the offsets would be okay.

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Yeah, but filesystem code does all sorts of things with individual bytes.

Can we close this now?

@mrbojangles3
Copy link
Author

Yeah, but filesystem code does all sorts of things with individual bytes.

Can we close this now?

Would you be willing to tag this version ?

@deitch
Copy link
Collaborator

deitch commented Sep 26, 2024

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants