Move src/parser.c to git LFS #273
Comments
We could reconsider whether parser.c should be committed at all. It is auto-generated with tree-sitter generate, so I am not sure it should be in the repository. |
Not committing it would be even better. But I was under the impression that this is a de facto requirement of a tree-sitter grammar repo, because this way all you have to do is check out the grammar’s git repository into a well-known location. I'm not sure if this is a hard requirement or if we can deviate from it. |
@damieng I just checked the recent size increases of
Why is there a conflict? Why isn't EDIT: For future reference, I asked the same question in tree-sitter/tree-sitter#2024. |
I don't know enough about how tree-sitter resolves these precedence rules to be helpful here, unfortunately (every time I read the docs I feel like I've got a good handle on it, but then once I dig into our non-trivial rule set it escapes me). Whatever we do here, it might be worth adding some kind of file size echo to a text file so that we can see at a glance on PRs how a change affects large generated file sizes. |
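One way the "file size echo" idea might look, as a minimal sketch only: a small script that records the byte size of each generated file in a text file that is committed alongside it, so the size change shows up in the PR diff. The function names and the `generated-sizes.txt` file name are hypothetical; only `src/parser.c` comes from this thread.

```python
from pathlib import Path


def size_report(paths, root="."):
    """Return one 'bytes<TAB>path' line per file; missing files are flagged."""
    lines = []
    for rel in paths:
        p = Path(root) / rel
        if p.is_file():
            lines.append(f"{p.stat().st_size}\t{rel}")
        else:
            lines.append(f"missing\t{rel}")
    return "\n".join(lines) + "\n"


def write_size_report(paths, out_file="generated-sizes.txt", root="."):
    """Write the report into the repository root (out_file is a made-up name)."""
    report = size_report(paths, root)
    (Path(root) / out_file).write_text(report, encoding="utf-8")
    return report
```

Calling `write_size_report(["src/parser.c"])` after regenerating the parser would refresh the text file; whether that runs locally or in CI is left open here.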
Re: I think nvim-treesitter and Emacs both have logic in them to compile shared libraries for parsers from files that live under In nvim-treesitter's case I think there may be logic to call In Emacs 29+'s case I don't think there is any logic ATM to run the |
We mostly use tar-balls of individual commits and download them via curl. With LFS, I don't know whether |
It looks like difftastic uses tree-sitter-c-sharp and assumes Not really sure though...but I would be surprised if @Wilfred didn't know :) |
I'm not sure how cursorless does its building, but I think it also transitively(?) uses tree-sitter-c-sharp. There's a custom May be @pokey knows the details. |
Here's where we build the wasm: https://github.com/cursorless-dev/vscode-parse-tree/blob/main/Makefile#L43 So, as long as That being said, I think we could probably add a step to generate parser.c before we compile the wasm, so should be ok. But I do believe every other tree-sitter grammar repo keeps the generated |
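For consumers that would need to "generate parser.c before we compile", the extra step could be as small as the following sketch. It assumes the tree-sitter CLI is installed and on PATH, and that it is run from the grammar's checkout; `ensure_parser` is a hypothetical helper name, not something any of the projects mentioned here actually ship.

```python
import subprocess
from pathlib import Path


def ensure_parser(repo_dir, run=subprocess.run):
    """Run `tree-sitter generate` in repo_dir if src/parser.c is absent.

    Returns True if generation was triggered, False if the file already existed.
    The `run` parameter exists only so the command runner can be swapped out.
    """
    parser_c = Path(repo_dir) / "src" / "parser.c"
    if parser_c.is_file():
        return False
    # Assumption: the tree-sitter CLI is on PATH. check=True raises on failure.
    run(["tree-sitter", "generate"], cwd=repo_dir, check=True)
    return True
```

A build script could call `ensure_parser(".")` before invoking the compiler, which keeps working unchanged for repositories that still commit the generated file.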
I think this should be a discussion across tree-sitter grammars in general. Having tree-sitter-c-sharp be different/unusual from other grammars in this respect will not help adoption. Unfortunately we don't really have a broad tree-sitter discussion channel. Perhaps we should start one. I've created a Discord one if anyone is interested: https://discord.gg/KrRa5ATb |
Agreed. A lot of discussion seems to happen on https://github.com/tree-sitter/tree-sitter/discussions, so it's probably worth asking there. Agreed a Discord server could be useful, though. I just asked to see if there already was one (tree-sitter/tree-sitter#2027) |
I found this related discussion: tree-sitter/tree-sitter#1243. |
I tried
Memory usage exceeded 50GB here too... |
Yes, it has grown quite considerably lately. We're going to have to make some trade-offs between handling every single scenario possible in C# vs performance I suspect. |
Somewhat related, I noticed this issue at the tree-sitter-swift repository: alex-pinkus/tree-sitter-swift#149 |
That's a pretty interesting solution, I'd be up for doing that here. I wonder if we can proactively reach out to dependencies we know will be affected. |
The issue of it being non-trivial to contact users of one's grammar seems to be the sort of thing that many grammars will / do face. I wonder if there is a good approach to this. We started a sticky issue where we have begun to announce potential upcoming changes and asked the users we know about to subscribe to it and let us know if things we're planning on could be a problem: sogaiu/tree-sitter-clojure#33 We tried to ascertain who was using us and then went around to the folks we came up with, but later it turned out there were other folks. From the perspective of a "user" (e.g. nvim-treesitter, Emacs 29+, difftastic, helix-editor, etc.) though, I would imagine that there would be a preference to not have different methods of being up-to-date about upcoming changes. Some kind of unified way seems like a better deal. I wonder if there's something that can be done at the tree-sitter repository...like a dedicated thread(?) / discussion per grammar? (I haven't found the Discussions UI / UX to be that great though -- stuff seems to get easily hidden and hard to search -- though possibly that has changed over time.) |
I wonder then if we flip what tree-sitter-swift did on its head and instead move development to a new We'd then adopt the tree-sitter-swift approach that when a release is tagged we generate the files and merge them into That way everyone who is relying on master and the generated artefacts today is unaffected other than lagging behind active development. Given there's only a small handful of contributors to the repo the work in us changing from developing against master to developing against |
Why not the other way around? Unless the user knows they can expect generated files to be inside the repository they don't expect it. Also updating a generated file adds major noise to the commit history. |
Improving the memory used to generate this grammar would be great. I'm at 26.5GB right now, and for some architectures (Arm and PPC) I don't even have the memory to generate the grammar. |
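For anyone wanting to compare memory figures like the ones reported in this thread, here is a rough, Unix-only sketch of measuring the peak resident memory of a command run as a child process. It uses Python's standard `resource` module; `peak_child_rss_bytes` is a hypothetical helper name, and the numbers it returns are only comparable when measured on the same platform.

```python
import resource
import subprocess
import sys


def peak_child_rss_bytes(cmd):
    """Run cmd to completion and return the peak resident set size in bytes.

    RUSAGE_CHILDREN reports the maximum over all children this process has
    waited for, so call this from a fresh process for a clean measurement.
    """
    subprocess.run(cmd, check=True)
    maxrss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return maxrss if sys.platform == "darwin" else maxrss * 1024
```

Something like `peak_child_rss_bytes(["tree-sitter", "generate"])`, run inside the grammar's checkout with the CLI on PATH, would give a single figure to quote when reporting regressions.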
We'd love to be able to improve the memory usage but the only options I'm aware of there are either:
We really don't want to do option 1 as it would be a breaking change and make this library less useful. As for option 2 I wouldn't know where to start and I suspect the only people who do are really busy on other things. |
I get the point; I wanted to highlight the amount of memory required to generate the current grammar. If only it were possible to generate parts of the grammar in a separate process. |
It's definitely something that needs to be done but that would happen in tree-sitter itself not the C# grammar. |
I started receiving the below warning on pushes:
I'm not sure what would be the impact of moving to LFS. Maybe some documentation would need to be updated.