Feature/leaky splits #203
Merged
Changes from 51 commits
60 commits
c8207ce
initial commit
d8c85e2
after much deliberation, quick implementation, and shell of lengthier…
0a5412b
small fixes
d4d2b99
added basic filepath hash functionality
07d06a7
refactor - very wip
a2b9594
some fixes
d13a494
to views implemented
d87d769
implemented leaks for hash
e42aaf5
sklearn backend basic functionality implemented and integrated
110a1c0
made the hash backend give out an ordered view
8f630bb
cache leak view after first time it's computed
6b4eaec
cache leak view after first time it's computed
4733226
some documentation and cleanup
276489b
far better caching mechanism
a923c6a
filter res so it's actually leaks and not just sim
80fe828
bugfix
855c9ac
more bugfixes
f6a6652
added model kwargs to leaky splits sklearn backend
fe28ce7
wrote main function
7d4a552
removed remove_leaks, replaced it with view_without_leaks
d55c9c7
cleanup and documentation
09b5e51
cleanup and documentation
7449545
removed patches
2d86e79
added checks for non empty support and no overlap when providing spli…
5b56182
refactor + bugfix sometimes a sample would be kept even when it had n…
a607ff5
fixed accessing previous brain runs
7c741ee
updated main function and fixed serialization bug
12b0975
typo
a390118
added cleanup
41190c4
another probably redundant optimization check
fe89ea7
a lot of thinking and not a lot of writing code
ee20404
optimized leak finding
07d5bd3
removed old code
2a63a37
updated docs
a5ad99c
moved compute function to __init__
24b3a29
updated docs
f53022c
removed more old code
4047459
moved similarity registration out of class, doesn't make sense for it…
3b54720
documentation fixes
644b8d5
cleaned up imports
eb34ca0
dealt with leaks by sample edge case
3e3ddb8
assume loading of brain run happens correctly
d2bdbd6
changed variable name
13364f7
made the method name lowercase
080e8d7
renamed leaks_by_sample to leaks_for_sample
54ecb5a
renamed view_without_leaks to no_leak_view
ff08fd3
updated docs
a666b58
compute embeddings on the fly
e1a7b4f
changed method type property
a352898
fixed passing tags
1cf9b5f
fixed order of precedence for defaults, similarity conf dict, and arg…
c445865
made id2split internal
9a189e0
throw warning when a considered sample is not in any of the splits
49059c9
added warnings for view matching heuristics
992767e
updated docs to reflect importance of arguments
27be7fa
changed variable for clarity
4f6f4a0
removed unused variable
7d9e418
changed leaks to leaks_view
a12c349
made tag leaks use tag_samples
9c51d46
changed _to_views docs
Currently, this code results in embeddings being computed for all samples, when in practice only 1000 need them. I think this is fine, just calling it out.
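To make the cost difference concrete, here is a minimal sketch. It is not the PR's code: `embed_all`, `embed_subset`, and `CountingEmbedder` are hypothetical names, the embedder is a stand-in that only counts calls, and the dataset size of 5000 is an assumed figure chosen for the demonstration.

```python
from typing import Callable, Dict, Iterable, List

Embedding = List[float]


def embed_all(
    samples: Dict[str, str], embed: Callable[[str], Embedding]
) -> Dict[str, Embedding]:
    """Embeds every sample, whether or not it is needed later."""
    return {sid: embed(path) for sid, path in samples.items()}


def embed_subset(
    samples: Dict[str, str],
    needed_ids: Iterable[str],
    embed: Callable[[str], Embedding],
) -> Dict[str, Embedding]:
    """Embeds only the samples whose IDs were requested."""
    return {sid: embed(samples[sid]) for sid in needed_ids if sid in samples}


class CountingEmbedder:
    """Hypothetical stand-in for a model; records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, path: str) -> Embedding:
        self.calls += 1
        return [float(len(path))]  # placeholder "embedding"


# Assumed sizes: 5000 samples in total, of which 1000 need embeddings
samples = {f"id{i}": f"/data/img_{i}.jpg" for i in range(5000)}
needed = [f"id{i}" for i in range(1000)]

eager = CountingEmbedder()
embed_all(samples, eager)

lazy = CountingEmbedder()
subset = embed_subset(samples, needed, lazy)

print(eager.calls, lazy.calls)  # prints "5000 1000"
```

Whether restricting the computation is worth the extra bookkeeping depends on how expensive the real model is and on whether the full set of embeddings gets reused elsewhere.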
See comment below.