-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiindex necessary? #410
Comments
Well for now, yes all 5 users. Do you use the multiindex? Simpler is always better exactly! So get rid of the multiindex! I'm not suggesting that things can't be done, it's just confusing for new users (something that will presumably be an issue in the future if we ever want other people to use it) and re-naming is a pain. I'm just trying to figure out why it was set up like this in the first place and why it remains necessary (and/or how hard it would be to get rid of it and keep VIS's functionality) |
Perhaps @crantila could speak to the logic behind the multi-index in its original conception. The utility of labelling data is sort of obvious but there are a couple of new directions that we have been taking with VIS that make the columnar multi-index either increasingly troublesome or unnecessary. It is more troublesome than before because if you want to filter one dataframe based on the values in another dataframe, you have to first remove the multi-indecies of both dataframes. We have only recently started to do this type of conditional filtering quite a lot. For example, say you have the noterest results of a piece saved to the variable nr, and the durations of those notes and rests saved to the variable dr. Both nr and dr are dataframes, and they are the same shape. But since their multi-indecies don't match (because the first level of nr is 'noterest.NoteRestIndexer' and the first level of dr is 'meter.DurationIndexer') you can't filter the notes for their durations until you remove or index through the first level of both multi-indecies. So, specifically if you were looking for all the instances of 'C4' that last at least a half note (2), you have to run this line of code:
But, if we didn't have multi-indexed columns, that same line of code would look like this:
@mborsodi , you were using conditional filtering quite a bit in your final projects, would you care to weigh in on this issue? Ok, so that was the ease-of-use consideration. In VIS 3, the multi-index will not be needed as much as it was before because most results will be stored as values in a dictionary of analyses stored as an attribute of an indexed_piece. Previously this caching was only done for the noterest indexer results. Since the key that you use to access a dataframe in this analysis dictionary via indexed_piece.get_data() is the name of the indexer in question, the label is still there, just not in the same place. It's not as if we have a bunch of dataframes just floating around without knowing what they correspond to. In fact, the dictionary keys are more precise because in addition to showing what indexer they were run with, the dataframes can be linked to the piece they were run on since they are associated with a specific indexed piece. |
This is related to #394 . |
In fact, the VIS isn't optimized to be easy to learn or use. If something you found on StackOverflow doesn't work with VIS, too bad, it's your responsibility to understand what you're doing and how to do it properly. If you don't like typing the long, namespaced indexer names, too bad, they aren't there for your sake to begin with. It's interesting for me to watch how VIS evolves over time. Before it was a Framework, VIS was a set of music21 scripts. As the GUI grew, the scripts evolved into an MVC-like structure where the data manipulation was still poorly-structured music21 scripts (VIS 1 through 9). As we dreamed of the web and analyses beyond counterpoint, we needed not specific to one UI, so we adopted Times have changed. RODAN learned how to handle things other than OMR, and now the robustness and predictability happens there. Unlikely though it was, more people started writing VIS scripts. Caching on remote servers has failed to materialize, but there's a new caching plan. Even though we've learned how to handle Multi-indexed columns were the simple solution. The situation has changed and those problems are solved in different (significantly more complicated) ways now, so the column names end up looking bizarre and superfluous. Let's shed a single tear: multi-indexed columns should go. |
@crantila , your post is both informative and hilarious. Bravo! |
Could you explain this a bit more? |
That's a strange way to say "please help us by working on the API!" @ahankinson VIS Framework was designed to run on a server as the primary thing in charge. That was the biggest influence on architectural decisions like the multi-indexed columns. Since I left, there's been a trend to decentralize complexity: server-side data flow is handled by RODAN; data visualization by VIS-Ualizer; nobody even remembered For the most part this means VIS itself can be simplified, but not always. Some of these changes are obviously better design decisions, some of them seem motivated by other concerns, but each one of them contributes to significantly more complicated overall solutions. |
Cleverly edited response. Snarky comments in produce snarky comments out. I'm sorry for criticizing decisions I don't understand. The point remains: the benefits you're seeing come at the cost of higher overall system complexity. If you're looking for substantial effort, I am available for contract work. Otherwise I would be happy to help with specific tasks. |
@crantila thanks for your comments. I think you make a good point about how the current solution of keeping the indexer and class information in the columnar multi-index is the simplest solution as far as the program architecture is concerned. So the ease of use we're looking for with the removal of this multi-index does come at the cost of somewhat more complexity in the backend. I hadn't thought of it that way but I agree with it. More generally, @crantila your feedback and/or critiques on this and any other topics are always welcome and appreciated. |
Thanks, Christopher. I was genuinely interested in your response to my question, and not trying to provoke anything. I think the intention for the melodic search is that it will be rolled into an indexer, but it's just sitting outside VIS while we figure it out. |
I was talking with Alex M. today about some annoyances that arise seemingly out of VIS's use of the multiindexed dataframes. Although Alex explained how the multiindex can be used, it seems that nobody is really making use of it. That is, everything most users do, they could do with just simple (non-multiindexed) pandas dataframes.
Getting rid of the extra "level" would make a lot of the indexing and renaming much simpler, no? Also, by removing the multiindex level, newer users to VIS would have an easier time manipulating objects and using the tools since there are a lot of tutorials, etc. for using pandas dataframes (but some of the manipulations as shown in those tutorials don't work within VIS because of the multiindexed dataframe). Thoughts?
The text was updated successfully, but these errors were encountered: