add tests for streams #11
Is there a way to parse a character stream? I have a use case where I want to parse a stream of lines that are not followed by a line break but preceded by one. There can be a significant wait between the individual lines, so I need to parse and process a line before its terminating newline is sent. Most line-based streaming tools break on that, unfortunately, so I imagine a Stream[Char] would be the right thing here. |
If I have a Stream[String] with strings of size 1, does atto apply its parser to each one, or to the beginning of the whole stream, across strings? I am playing with writing a tool that parses scalac output, ignores bogus type errors based on heuristics, pretty-prints types, etc. Scalac doesn't emit \n after its type errors, but before, apparently. Or it's sbt. |
So, yeah if you use the existing process combinator it will feed each string to the parser and emit values as they are complete (saving any remaining input) and either discard errors or halt on error (depending on which combinator you use). It's straightforward to write a custom processor though ... the current approach handles two possible use cases but it may not match what you're doing. If you want to describe it in a bit more detail I can give you a more precise answer. |
sbt prints
then waits, no \n following the ^. at some later point the next
arrives. I need to parse the first [error]......^ section without waiting for a \n following the ^. |
does atto call the parser on each element of the string individually or does it effectively turn the Stream[String] into a Stream[Char] and run the parser on that? |
The parser consumes strings, which it treats logically as chunks of characters; working on strings is simply more efficient than working on individual characters. On success there may be leftover input, which the stream processor uses as the initial input for parsing the next chunk. For example, the result here includes the residual input:
scala> int.sepBy(char('.')).parse("128.42.32.12 woozle")
res2: atto.ParseResult[List[Int]] = Done( woozle,List(128, 42, 32, 12)) |
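To make the leftover and resume behaviour concrete, here is a minimal sketch that feeds the same input in two chunks. The split point and the object name FeedDemo are arbitrary and only for illustration; the sketch assumes nothing beyond atto's parse, feed, done and option.

```scala
import atto._, Atto._

object FeedDemo {
  val ip: Parser[List[Int]] = int.sepBy(char('.'))

  // First chunk ends in the middle of a number, so the parser
  // cannot finish yet and should be waiting for more input.
  val r1: ParseResult[List[Int]] = ip.parse("128.4")

  // Feeding the next chunk resumes parsing where it stopped.
  val r2: ParseResult[List[Int]] = r1.feed("2.32.12 woozle")

  // The same input parsed in one go, for comparison.
  val whole: ParseResult[List[Int]] = ip.parse("128.42.32.12 woozle")

  def main(args: Array[String]): Unit = {
    println(r2)
    // Chunked and unchunked parsing should agree on the parsed value.
    assert(r2.done.option == whole.done.option)
  }
}
```

Calling done before comparing signals end of input, so a result that is still waiting for more characters is forced to a final answer.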
I would need something vaguely like this:
Parse a single parseable value off the stream of characters, return it and the remainder of the stream |
Doesn't look like atto does that right now. |
Easy enough to hack up. As always it will come down to details.
import atto._, Atto._
import ParseResult._
def chunk[A](chars: Stream[Char], p: Parser[A]): (ParseResult[A], Stream[Char]) = {
  // Feed one character at a time until the parser finishes or the stream ends.
  def go(s: Stream[Char], pr: ParseResult[A]): (ParseResult[A], Stream[Char]) =
    pr match {
      case Done(_, _)    => (pr, s)
      case Fail(_, _, _) => (pr, s)
      case Partial(_)    =>
        s match {
          case c #:: cs => go(cs, pr.feed(c.toString))
          case _        => (pr.done, s) // stream exhausted: signal end of input
        }
    }
  go(chars, p.parse(""))
}
scala> chunk("123,33,111242346456".toStream, long <~ (char(',') || endOfInput))
res16: (atto.ParseResult[Long], Stream[Char]) = (Done(,123),Stream(3, ?))
You can use this to define a chunks function:
def chunks[A](chars: Stream[Char], p: Parser[A]): Stream[A] =
  chunk(chars, p) match {
    case (Done(cs, a), s) => a #:: chunks(cs.toStream ++ s, p)
    case _                => Stream.Empty // or something
  }
scala> chunks("123,33,111242346456".toStream, long <~ (char(',') || endOfInput))
res17: Stream[Long] = Stream(123, ?)
scala> res17.toList
res18: List[Long] = List(123, 33, 111242346456) |
there are a lot of corner cases depending on chunking of input, so it would be really nice to have fuzz tests for streams
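As a rough idea of what such a fuzz test could look like, here is a minimal sketch that splits one input at random positions and checks that feeding the pieces gives the same value as parsing the whole string. The helpers randomChunks and parseChunked and the object name ChunkFuzz are hypothetical, not part of atto. A property-testing library such as ScalaCheck would be a natural fit for the real thing; plain scala.util.Random with a fixed seed is used here only to avoid an extra dependency.

```scala
import atto._, Atto._
import scala.util.Random

object ChunkFuzz {

  // Hypothetical helper: split s into random non-empty pieces.
  def randomChunks(s: String, rnd: Random): List[String] =
    if (s.isEmpty) Nil
    else {
      val n = 1 + rnd.nextInt(s.length) // chunk size between 1 and s.length
      s.take(n) :: randomChunks(s.drop(n), rnd)
    }

  // Hypothetical helper: feed the chunks one at a time, then signal end of input.
  def parseChunked[A](p: Parser[A], chunks: List[String]): Option[A] =
    chunks.foldLeft(p.parse(""))((r, c) => r.feed(c)).done.option

  def main(args: Array[String]): Unit = {
    val p     = long.sepBy(char(','))
    val input = "123,33,111242346456"
    val whole = p.parse(input).done.option
    val rnd   = new Random(42L) // fixed seed so failures are reproducible

    for (_ <- 1 to 1000) {
      val chunks = randomChunks(input, rnd)
      assert(parseChunked(p, chunks) == whole, s"mismatch for chunking $chunks")
    }
  }
}
```

This only exercises the core feed/done path on a single input. Covering the stream processors themselves would need the same idea applied to their inputs, along with generated inputs and parsers that fail partway through.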