Idea for NEURON's Internal Data Structures #2182
Replies: 12 comments
-
@ctrl-z-9000-times: first of all, thank you very much for writing such a detailed proposal. Really appreciated! I only spent ~10 minutes going through the above and will need more time to think through the various aspects. But I just want to say that we completely agree that redesigning the data structures is quite important for improving NEURON (performance & maintainability). So we are definitely interested in having a conversation about this. Will get in touch with you! cc: @nrnhines @ramcdougal @alexsavulescu @ohm314 @olupton @iomaganaris
-
I've put together a more detailed and organized design document: https://ctrl-z-9000-times.github.io/homepage/database_design.html
-
This is a very timely and interesting proposal, given that it incorporates in some ways a main aim I'm toying with while planning the NEURON grant renewal (see https://docs.google.com/document/d/1Ht5zR3XGHUeqj3URDDzGj7Hd_YIMF7DItnTCQMGx5Sg/edit?usp=sharing). Although I haven't had time to digest all the implications or the magnitude of the change with respect to "DataBase", I'm keen on entering into discussion.
-
I read through your idea for "TrackingPointers" and the database should be able to do that. In fact, trying to do what TrackingPointers do was part of what made me realize that I need a database in the first place.
-
Hi @ctrl-z-9000-times, thank you for taking the time to come up with this proposal. We would definitely need more dedicated time than what we can fit into the NEURON Dev Meeting. How about a first discussion this Friday, where maybe you could present a few slides with a high-level overview/architecture, together with some pros and cons of your solution? We can then see about a proper follow-up.
-
That sounds great. I will put together a presentation of the proposal. I'm available all day on Friday, so I will let you all pick the meeting time.
-
Just a few slides to be presented at the NEURON Dev Meeting to begin with. Could you give me your email address so that I can invite you?
-
My email address is: [email protected]
-
You can find all the details here: https://github.com/neuronsimulator/nrn/wiki/July-2021. Please let me know if the slides are inaccessible.
-
I got the Zoom meeting link, thanks.
-
Thanks @ctrl-z-9000-times! Regarding SoA, you can find our approach, as used in the CoreNEURON library, in https://www.frontiersin.org/articles/10.3389/fninf.2019.00063/full.
-
During the developers meeting I mentioned an alternative method for solving linear differential equations.
SciPy implements a method for computing the matrix exponential: scipy.sparse.linalg.expm
Edit: The method is called "exact", so why did I say it is less accurate than what NEURON currently does?
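To make the idea concrete, here is a minimal sketch of how the matrix exponential can advance a linear system with constant coefficients, dx/dt = A x, using x(t + dt) = expm(A dt) x(t). This is only an illustration and not what NEURON does; the two-state matrix `A`, the step size `dt`, and the initial state are made-up values.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import expm

# Hypothetical two-state kinetic scheme, dx/dt = A @ x.
# Each column sums to zero, so the total of x is conserved.
A = csc_matrix(np.array([[-2.0, 1.0],
                         [2.0, -1.0]]))
dt = 0.1  # hypothetical time step

# Compute the propagator once; it stays valid while A and dt are constant.
propagator = expm(A * dt)

x = np.array([1.0, 0.0])  # start with everything in the first state
for _ in range(100):
    x = propagator @ x    # advance the state by one time step

print(x)
```

The propagator is computed once and reused, so each time step costs a single matrix-vector product. This only holds while the coefficients in A do not change between steps.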
-
Hello NEURON Developers,
I am responding to your discussion about NEURON's internal data structures [1]. I have some ideas for how they could be improved.
My name is David McDougall. By training and profession I am a computer programmer [2]. However, for the past several years my passion has been neuroscience. I have been observing the development of the NEURON simulator with interest, and at the same time developing my own neural simulator [3]. I believe that I get better at writing neural simulators with every try, and so after many attempts I think that I have found some good designs for implementing neural simulators.
NEURON's internal data structures are complicated and contain a large quantity of data. Currently they are implemented as C structures which are connected by pointers. It is a working, stable legacy code base. The improvements that I'm suggesting would require significant modifications.
EDIT: I wrote a more detailed and organized description: https://ctrl-z-9000-times.github.io/homepage/database_design.html
What NEURON needs is a piece of software to manage its data: a database. To make a database, you first make a list of all of your data and a list of all of the ways you will interact with the data. Then you design the internal data structures and algorithms. Finally, you design an API for using the database. I've done this design work. The remainder of this message is a description of my proposed database design.
Terminology
To begin with, the database must be tailored for running physics simulations. To that end it has special words for dealing with physical things: Archetypes, Entities, and Components. An Entity is a thing which exists in your simulation, such as a neuron, a segment, or the 3D volume inside of a segment. Each Entity has a type which is known as an Archetype, and the Archetype defines what data Components are attached to each Entity. The data Components are where all of the data actually lives. The database will need multiple types of Archetypes, Entities, and Components for representing different and specialized things.
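Here is a minimal sketch of how these three terms might fit together. It is an illustration only and not the API of my prototype: the `Archetype` class, its methods, and the component names `voltage` and `diameter` are all hypothetical.

```python
import numpy as np

class Archetype:
    """A type of Entity; owns one contiguous array per data Component."""

    def __init__(self, name, components):
        self.name = name
        self.size = 0  # number of Entities of this type
        # `components` maps a component name to its dtype.
        self.data = {comp: np.empty(0, dtype=dtype)
                     for comp, dtype in components.items()}

    def create_entity(self, **values):
        """Append one Entity and return its index within this Archetype."""
        for comp, array in self.data.items():
            self.data[comp] = np.append(array, values[comp])
        self.size += 1
        return self.size - 1

# Hypothetical archetype describing the segments of a neuron.
segment = Archetype("Segment", {"voltage": np.float64,
                                "diameter": np.float64})
a = segment.create_entity(voltage=-70.0, diameter=2.0)
b = segment.create_entity(voltage=-65.0, diameter=1.5)

# Each component is a flat array, so updates apply to all Entities at once.
segment.data["voltage"] += 1.0
```

An Entity here is nothing more than an index into its Archetype's component arrays, which keeps each component contiguous in memory.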
Types of Components and Archetypes
Pointers
The database will need to contain pointers. However, the database will never give the user a raw C pointer! Instead it presents an Entity "Handle" which the database manages and retains access to. Because the database knows the location of every pointer, the database can move Entities and their Components to new memory locations and update all of the pointers. This is critically important for destroying Entities without leaving holes in your otherwise contiguous data arrays.
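The following sketch shows one way this could work. It is not the code from my prototype: the `Handle` and `Table` classes are hypothetical, and it uses the common "swap with the last row" technique to keep the array contiguous when an Entity is destroyed.

```python
import numpy as np

class Handle:
    """Stands in for a pointer; the database may rewrite `index` at any time."""

    def __init__(self, index):
        self.index = index

class Table:
    """One contiguous data array plus the handles that refer into it."""

    def __init__(self):
        self.values = np.empty(0, dtype=np.float64)
        self.handles = []  # handles[i] is the Handle for row i

    def create(self, value):
        handle = Handle(len(self.handles))
        self.values = np.append(self.values, value)
        self.handles.append(handle)
        return handle

    def get(self, handle):
        return self.values[handle.index]

    def destroy(self, handle):
        """Remove a row by moving the last row into its place."""
        dead, last = handle.index, len(self.handles) - 1
        self.values[dead] = self.values[last]
        moved = self.handles[last]
        self.handles[dead] = moved
        moved.index = dead             # update the handle of the moved row
        self.values = self.values[:last]
        self.handles.pop()
        handle.index = None            # mark the destroyed handle as invalid

table = Table()
h0 = table.create(10.0)
h1 = table.create(20.0)
h2 = table.create(30.0)
table.destroy(h0)  # row 2 moves into row 0; h2 still resolves correctly
```

The user only ever holds `Handle` objects, so the table is free to relocate rows. A full database would also have to update any pointers that are stored inside other Components, which this sketch does not cover.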
Graphics Cards
The database could be made to understand GPU memory systems. It can integrate with an existing GPU programming environment (such as CUDA) and facilitate writing GPU kernels for it. My prototype database uses the "cupy" and "numba" libraries to transfer data to/from GPUs and to write GPU kernels, respectively.
Error Checking
You can add error checking to your components and have the database deal with running the checks. For example, you can enforce a valid range of values, or raise an error when it sees NaN values or NULL pointers. These checks can be configured or disabled for each component. And of course you can have a global DEBUG flag to enable or disable error checking.
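As a rough sketch of what per-component checks could look like (the `Component` class, its parameters, and the module-level `DEBUG` flag below are hypothetical names, not my prototype's API):

```python
import numpy as np

DEBUG = True  # hypothetical global switch for all error checking

class Component:
    """A data array with optional, configurable validity checks."""

    def __init__(self, name, valid_range=(None, None),
                 allow_nan=False, enabled=True):
        self.name = name
        self.valid_range = valid_range  # (low, high); None means unbounded
        self.allow_nan = allow_nan
        self.enabled = enabled          # per-component on/off switch
        self.data = np.empty(0, dtype=np.float64)

    def check(self):
        """Raise ValueError if the data violates this component's rules."""
        if not (DEBUG and self.enabled):
            return
        if not self.allow_nan and np.isnan(self.data).any():
            raise ValueError(f"{self.name}: contains NaN")
        low, high = self.valid_range
        if low is not None and (self.data < low).any():
            raise ValueError(f"{self.name}: value below {low}")
        if high is not None and (self.data > high).any():
            raise ValueError(f"{self.name}: value above {high}")

diameter = Component("diameter", valid_range=(0.0, None))
diameter.data = np.array([1.0, 2.5])
diameter.check()  # passes silently

diameter.data = np.array([1.0, -2.5])
try:
    diameter.check()
except ValueError as error:
    message = str(error)
    print(message)
```

Because the rules live on the component, the database can run `check()` on every component from one place, for example after each time step.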
Documentation
As the central repository for the data, it makes sense to also store any descriptions of the data in the database too. The database can render nice looking documentation of itself, showing how it is organized. You can write extra documentation for any Archetype or Component and have your explanations show up in the rendered document.
You can specify Units for your data components. The database could understand how units work, and provide tools for converting between units. Another use for units in the database is labeling the axes of data plots with the correct units.
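A small sketch of how documentation and units could be stored next to the data. Everything here is hypothetical: the `UNIT_SCALE` table, the `convert` and `render_docs` functions, and the `ComponentInfo` class are illustrative names only.

```python
from dataclasses import dataclass

# Hypothetical table: unit name -> (physical dimension, scale to base unit).
UNIT_SCALE = {
    "V": ("voltage", 1.0),
    "mV": ("voltage", 1e-3),
    "m": ("length", 1.0),
    "um": ("length", 1e-6),
}

def convert(value, from_unit, to_unit):
    """Convert a value between two units of the same physical dimension."""
    from_dim, from_scale = UNIT_SCALE[from_unit]
    to_dim, to_scale = UNIT_SCALE[to_unit]
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    return value * from_scale / to_scale

@dataclass
class ComponentInfo:
    name: str
    units: str
    doc: str = ""

    def axis_label(self):
        """Label for a plot axis, including the units."""
        return f"{self.name} ({self.units})"

def render_docs(archetype_name, archetype_doc, components):
    """Render a plain-text description of an Archetype and its Components."""
    lines = [f"{archetype_name}: {archetype_doc}"]
    for comp in components:
        lines.append(f"  {comp.name} [{comp.units}]: {comp.doc}")
    return "\n".join(lines)

voltage = ComponentInfo("voltage", "mV", "Membrane potential.")
diameter = ComponentInfo("diameter", "um", "Diameter of the segment.")
docs = render_docs("Segment", "A piece of a neuron.", [voltage, diameter])
print(docs)
print(voltage.axis_label())
```

The same metadata drives the rendered documentation, the unit conversions, and the plot labels, so each description is written only once.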
Conclusion
The primary benefit of using a database is that it gives you a way to think about data in the abstract. By putting all of the data into a single place you can see all of the commonalities. You can write software tools which do more things with fewer lines of code and with almost no duplicated code. New features can be enabled and applied to every datum from a central location.
All of these things can be implemented to run very efficiently. I've implemented a prototype which does most of these things with reasonable efficiency. My simulator is not yet complete but the database portion of it is mostly done. I'm planning on continuing my work on my project, but I could be persuaded to help to apply these ideas to the neural simulator of your choice.
If you are interested in discussing these ideas further then maybe we could talk at one of the NEURON developer meetings?
Sincerely,
David McDougall
[1] https://drive.google.com/file/d/1nZxxTcLrNxIBj2z6zPGufxlCHaBqndzl/view?usp=sharing
Overview of NEURON data structures.
[2] https://ctrl-z-9000-times.github.io/homepage/resume_2021.pdf
My resume. I am currently available for employment.
[3] https://github.com/ctrl-z-9000-times/NEUWON
Disclaimer, this is a work in progress and it does not have any documentation.
I've provided this link as proof of its existence, nothing more.