tab-separated columns #30

Open
gelisam opened this issue Aug 10, 2013 · 7 comments

@gelisam
Collaborator

gelisam commented Aug 10, 2013

hawk currently outputs tuples as space-separated columns by default. I think tabs should be the default.

@melrief
Collaborator

melrief commented Aug 10, 2013

I agree, but maybe not just tuples. Every container that holds more than one value should separate its values with \t. For instance:

> hawk -e '[(1, 2)]'
1\t2
> hawk -e '[["foo bar", "foo bar"]]'
foo bar\tfoo bar

A bonus is that you can print on the same line many tab-separated strings that are easy to parse:

> hawk -e '[["hello world","foo bar"]]'
hello world\tfoo bar
> hawk -e '[["hello world","foo bar"]]' | hawk -m "L.map (L.length . words) . split '\t'"
2\t2
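
The proposed rendering can be sketched in plain Haskell. This is only an illustration, not hawk's or hsp's actual code: `renderRows` and `splitTabs` are hypothetical names, and the sketch assumes every value is already a `String`.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: each inner list becomes one output line,
-- with its items joined by a tab character.
renderRows :: [[String]] -> String
renderRows = unlines . map (intercalate "\t")

-- Hypothetical helper: split one line back into columns on tabs.
-- Spaces inside a column are preserved, which is the "bonus" described above.
splitTabs :: String -> [String]
splitTabs s = case break (== '\t') s of
  (col, [])       -> [col]
  (col, _ : rest) -> col : splitTabs rest
```

In GHCi, `map (length . words) (splitTabs "hello world\tfoo bar")` evaluates to `[2,2]`, mirroring the last example.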

@melrief
Collaborator

melrief commented Aug 10, 2013

Since I have already written the code to do this for hsp, I can push it if you both agree.

@gelisam
Collaborator Author

gelisam commented Aug 10, 2013

I have concerns about allowing a list to represent a single line, as this will interact badly with --magic.

As you know, the whole point of --magic is to use the type of the user's expression to infer whether the expression is supposed to be a map...

> printf "hi\nworld" | hawk --magic B.length
2
5

a fold...

> seq 3 | hawk --magic '(+)'
6

or an operation on a list of lines:

> seq 10 | hawk --magic 'L.drop 7'
8
9
10

But if we allow lists to represent the content of a single line, then it is no longer clear whether 'L.drop 7' is supposed to drop the first 7 lines or, interpreted as a map, the first 7 columns.

I think the first interpretation is going to be more common, so if we do allow lists to represent lines, --magic should not complain that the type is ambiguous; it should instead try to do the most-probably-correct thing:

> hawk --magic '[[1..3], [4..6], [7..9]]'
1   2   3
4   5   6
7   8   9
> hawk --magic '[1..3]'
1
2
3
> hawk --magic '1'
1

Therefore, I am slightly worried that if we teach the user that a list of lists represents lines of tab-separated items, then the user will expect to be able to use --map to omit the outer list, and if we allow that, then the user will expect --magic to detect that a map is expected, and that won't work. It's a slippery slope.

> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk 'map L.reverse'
3   2   1
6   5   4
9   8   7
> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --map L.reverse
3   2   1
6   5   4
9   8   7
> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --magic L.reverse
7   8   9
4   5   6
1   2   3

Notice that the third attempt has a different behaviour than the first two.
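
The ambiguity can be reproduced in plain Haskell without hawk. The sketch below assumes the input is modelled as a list of lines, each line being a list of columns; `table`, `reverseLines` and `reverseColumns` are illustrative names, not part of hawk.

```haskell
-- Input modelled as a list of lines, each line a list of columns.
table :: [[Int]]
table = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

-- Interpretation 1: the expression applies to the list of lines.
reverseLines :: [[a]] -> [[a]]
reverseLines = reverse

-- Interpretation 2: the expression is mapped over the columns of each line.
reverseColumns :: [[a]] -> [[a]]
reverseColumns = map reverse
```

Because `reverse :: [a] -> [a]` type-checks at either level, the type alone cannot say which interpretation was meant. `reverseLines table` gives `[[7,8,9],[4,5,6],[1,2,3]]`, while `reverseColumns table` gives `[[3,2,1],[6,5,4],[9,8,7]]`.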

@ssadler
Owner

ssadler commented Aug 10, 2013

I prefer the distinction between lists and tuples, i.e., list items are separate lines and tuples are tab-separated.
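
A minimal sketch of that distinction, assuming pairs only and using `Show` for the values; `renderList` and `renderPairs` are hypothetical names rather than anything in hawk:

```haskell
-- Hypothetical: every list item goes on its own line.
renderList :: Show a => [a] -> String
renderList = unlines . map show

-- Hypothetical: a list of pairs prints one pair per line,
-- with the two components separated by a tab.
renderPairs :: (Show a, Show b) => [(a, b)] -> String
renderPairs = unlines . map (\(a, b) -> show a ++ "\t" ++ show b)
```

With these definitions, `renderPairs [(1 :: Int, 2 :: Int)]` is `"1\t2\n"` and `renderList [1 :: Int, 2, 3]` is `"1\n2\n3\n"`.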

@melrief
Collaborator

melrief commented Aug 10, 2013

@gelisam this is a delicate topic. I see the input of a unix command as a list of lists of strings, where the outer list is the list of lines and each inner list is the list of words in a single line. So for me the example hawk --magic L.reverse has the correct output. But --magic can still be very useful, if not essential, when the user specifies a correct expression:

> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk 'L.map sum'
Couldn't match type `[[b0]]' with `Data.ByteString.Lazy.Internal.ByteString'
Expected type: Data.ByteString.Lazy.Internal.ByteString -> GHC.Types.IO ()
Actual type: [[b0]] -> GHC.Types.IO ()
> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --magic 'L.map sum'
6
15
24

For me, the user must understand the types involved in hawk. This is not optional: --magic can relax the requirement only when it is clear what the user wants, but it does not remove the prerequisite. Regarding your last example, magic cannot (and for me must not) automatically infer a map on a list of lists. The user must specify it:

> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --magic L.reverse
7   8   9
4   5   6
1   2   3
> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --magic 'L.map L.reverse'
3   2   1
6   5   4
9   8   7

I like it this way, to be honest. Alternatively, a solution could be to let magic work with -d and -m:

> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk --magic L.reverse
7   8   9
4   5   6
1   2   3
> printf "1\t2\t3\n4\t5\t6\n7\t8\t9" | hawk -m --magic L.reverse
3   2   1
6   5   4
9   8   7

@ssadler hsp shows tuples as lists but hsl considers tuples very different from lists. We should discuss this further, maybe by opening a new issue just about tuples to clarify their use cases? The system that shows a type is very easy to change.

@gelisam
Collaborator Author

gelisam commented Aug 10, 2013

I'm fine with the behaviour of the above 6 examples.

@gelisam
Collaborator Author

gelisam commented Aug 10, 2013

How about tab-separated tuples, but whitespace-separated lists?
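
One possible reading of this compromise, sketched under the assumption that it concerns how a single line is rendered and that "whitespace" means a single space; `renderTupleRow` and `renderListRow` are hypothetical names:

```haskell
-- Hypothetical: a tuple used as a line is tab-separated.
renderTupleRow :: (String, String) -> String
renderTupleRow (a, b) = a ++ "\t" ++ b

-- Hypothetical: a list used as a line is separated by single spaces.
renderListRow :: [String] -> String
renderListRow = unwords
```

Under this reading, only tuple output keeps columns that themselves contain spaces distinguishable: `renderTupleRow ("foo bar", "baz")` is `"foo bar\tbaz"`, whereas `renderListRow ["foo bar", "baz"]` is `"foo bar baz"`.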


3 participants