Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Created the functionality to split the columns for issue #85 #92

Merged
merged 13 commits into from
Oct 7, 2023

Conversation

puneetsharma04
Copy link
Contributor

@SemyonSinchenko : Could you please review the code changes and give the comments.

@SemyonSinchenko
Copy link
Collaborator

@MrPowers what do you think about it? Should we have this functionality in a separate file? Or better to add it directly to functions.py?

@MrPowers
Copy link
Collaborator

@SemyonSinchenko - I'm fine having this functionality in a separate file. It's actually nice having a separate file. There should be like 10 tests, so having the separation is nice.

@puneetsharma04 - can you please add an assertion to the test? Can you also add different tests to cover all the edge cases? Thank you!

@MrPowers
Copy link
Collaborator

xref: #85

@puneetsharma04
Copy link
Contributor Author

@MrPowers @SemyonSinchenko : Thanks for the review comments , i will make the code changes to include test case and commit the changes.

@puneetsharma04
Copy link
Contributor Author

@SemyonSinchenko & @MrPowers : Please review the test cases and share you review & comments.

quinn/split_columns.py Outdated Show resolved Hide resolved
quinn/split_columns.py Outdated Show resolved Hide resolved
@SemyonSinchenko
Copy link
Collaborator

@puneetsharma04 I checked again discussion inside #85 and I understand it in the following way:

  1. If the mode is strict we should raise an IndexOutOfBound exception if after split we have less values than number of names in new_col_names
  2. If the mode is permissive we should replace missing valus by null or default (default should be an optional parameter in signature)

Also I want to see an ability to split column not only two variables but to any amount of variables.

Example 1:

  • mode: strict
  • intitial value "val1_val2"
  • new_col_names: ["col1", "col2", "col3"]
  • RESULT: ArrayIndexOfBoundException (or something like this)
    Example 2:
  • mode: strict
  • intitial value "val1_val2_val3"
  • new_col_names: ["col1", "col2", "col3"]
  • RESULT: col1=val1, col2=val2, col3=val3
    Example 3:
  • mode: permissive
  • default: "default"
  • intitial value "val1_val2"
  • new_col_names: ["col1", "col2", "col3"]
  • RESULT: col1=val1, col2=val2, col3=default

@puneetsharma04
Copy link
Contributor Author

@SemyonSinchenko : Thanks for sharing the review comments.Let me refactor the code and share again.
Could you please check the below given scenario and share the comments:
mode: strict
intitial value: "val1_val2"
new_col_names: ["col1", "col2"]
RESULT: for some records if val2 is null or blank, then how would you like it to be handled ?

@SemyonSinchenko
Copy link
Collaborator

@SemyonSinchenko : Thanks for sharing the review comments.Let me refactor the code and share again.
Could you please check the below given scenario and share the comments:
mode: strict
intitial value: "val1_val2"
new_col_names: ["col1", "col2"]
RESULT: for some records if val2 is null or blank, then how would you like it to be handled ?

In this case we should raise an Exception because mode is strict. Strict mode means that you should check the length of result of the split.

@puneetsharma04
Copy link
Contributor Author

@SemyonSinchenko : Could you please review the code changes and share your comments if it matches the expectations.

@SemyonSinchenko
Copy link
Collaborator

@SemyonSinchenko : Could you please review the code changes and share your comments if it matches the expectations.

A lot of changes in the last commit.. I do it in a weekend. Sorry for such a delay: a lot of work :(

@MrPowers MrPowers requested a review from SemyonSinchenko May 7, 2023 23:53
@MrPowers MrPowers merged commit 988efd1 into mrpowers-io:main Oct 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants