Missings #136

Open
alfredjmduncan opened this issue Jul 23, 2020 · 6 comments
Comments

@alfredjmduncan

Currently, plot functions accept data with type AbstractMatrix{Real}. This means that the code throws a MethodError when passed data that allows for or includes missing values.

When plotting multiple time series with different frequencies / time spans, there can be quite a bit of messy wrangling required before passing the data through to the plotting functions in PGFPlots.

It would be much appreciated if the plot functions could accept data as AbstractMatrix{Union{Missing,Real}}, with PGFPlots dropping the missings for each trace before plotting.
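For illustration, the failure can be reproduced with a stand-in function. The name `ncols` and its signature are hypothetical, chosen only to show how a Real-typed matrix argument behaves; this is not the actual PGFPlots.jl method.

```julia
# Hypothetical stand-in for a plotting function that only accepts real-valued matrices.
ncols(data::AbstractMatrix{<:Real}) = size(data, 2)

complete = [1.0 2.0; 3.0 4.0]      # Matrix{Float64}
gappy = [1.0 missing; 3.0 4.0]     # Matrix{Union{Missing, Float64}}

ncols(complete)                    # dispatches normally

# The element type Union{Missing, Float64} is not a subtype of Real,
# so no method matches and Julia throws a MethodError.
threw = try
    ncols(gappy)
    false
catch err
    err isa MethodError
end
```

The error occurs even when a column merely allows missings without containing any, because dispatch looks at the element type rather than the values.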

@mykelk
Member

mykelk commented Jul 23, 2020

Great idea! We'd welcome a PR.

@alfredjmduncan
Author

There are a few design questions. I guess the two main options are:

  1. Pass the missings to PGFPlots as nan, which is a standard way to code missing values in PGFPlots. This would mean
  • updating the plotHelper functions in PGFPlots.jl, then
  • updating the accepted Real / Complex types to Union{Real,Missing} / Union{Complex,Missing} throughout.

(It would also be possible to pass the missings as empty strings, which is more appealing than nans in some ways. But this only works in PGFPlots if values are delimited with commas or semicolons. In some plotHelper functions, values are currently delimited with spaces).

  2. Another option would be to allow missings only when passing a DataFrame to PGFPlots, and to filter the missings out of the supplied DataFrame columns before dispatching to the plotting functions. This would only require updating lines 45-51 of PGFPlots.jl.

(2) is a much smaller change, but drops some useful information from the resulting .tex output files. (1) would allow the user to set whether PGFPlots skips or jumps missing values, which is a useful feature in PGFPlots.
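The difference between the two options can be sketched on plain vectors. The helper names `nan_encode` and `drop_missing_points` are hypothetical and do not exist in PGFPlots.jl; this is only a minimal sketch of the data handling, not of the .tex serialization.

```julia
# Hypothetical helpers for illustration; neither exists in PGFPlots.jl.

# Option (1): keep every point and encode missing as NaN, so the gap
# is still visible in the coordinates written to the .tex file.
nan_encode(col) = [ismissing(v) ? NaN : Float64(v) for v in col]

# Option (2): drop every point where x or y is missing before plotting.
function drop_missing_points(x, y)
    keep = .!(ismissing.(x) .| ismissing.(y))
    return Float64.(x[keep]), Float64.(y[keep])
end

x = [1, 2, 3, 4]
y = [0.5, missing, 1.5, 2.0]

y_nan = nan_encode(y)                 # four values, the second is NaN
xs, ys = drop_missing_points(x, y)    # three points remain
```

With option (1) the NaN entries reach pgfplots, where, as far as I know, the `unbounded coords` key decides whether the line is interrupted at the gap or the point is simply discarded. With option (2) that choice is no longer available, because the gap never reaches the output.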

@mykelk
Member

mykelk commented Jul 25, 2020

@tawheeler Do you have a preference?

@tawheeler
Member

Julia now has core support for missing values. It seems to make sense to support that directly in PGFPlots.jl as well.

PGFPlots.jl is a weird package in that, rather than typing things, we basically don't add types to anything, and rely on the type itself to dictate how it gets serialized to text when writing to a .tex file.
The data itself is an exception to this. As @alfredjmduncan points out, the data fields are of type AbstractMatrix{Real}. I like the idea of moving to Union{Missing, Real}, and then using skipmissing in plotHelper.
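A minimal sketch of what using skipmissing might look like on a single trace follows. It uses plain vectors rather than the actual plotHelper code, and the variable names are illustrative assumptions.

```julia
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, missing, 30.0, 40.0]

# skipmissing on one column removes the missing entries, but the result
# is shorter than x, so the two columns no longer line up point by point.
y_clean = collect(skipmissing(y))

# For coordinate pairs, the filter has to be applied jointly to x and y.
points = [(xi, yi) for (xi, yi) in zip(x, y) if !ismissing(xi) && !ismissing(yi)]
```

The joint filter matters whenever a trace is written as (x, y) coordinates: skipping missings column by column would silently pair the wrong x with each remaining y.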

@mykelk
Member

mykelk commented Jul 25, 2020

Sounds good @tawheeler

@alfredjmduncan
Author

OK great! I'll have a go at the PR.
