skiprows parameter returns error in read_pandas #84

Open
akcssb opened this issue Nov 2, 2023 · 0 comments

The pandas read_parquet() function has an optional skiprows parameter. read_pandas() does not support it: the keyword is forwarded through **kwargs to pyarrow's to_pandas(), which rejects it with the following error:


TypeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 dp.read_pandas(filstier[2], skiprows=5)

File ~/stat-ofi-skatteregnskap/.venv/lib/python3.10/site-packages/dapla/pandas.py:37, in read_pandas(gcs_path, file_format, columns, **kwargs)
30 gcs_path = FileClient._remove_gcs_uri_prefix(gcs_path)
32 parquet_ds = pq.ParquetDataset(
33 gcs_path,
34 filesystem=fs,
35 use_legacy_dataset=False,
36 )
---> 37 return parquet_ds.read_pandas(columns=columns).to_pandas(
38 split_blocks=False, self_destruct=True, **kwargs
39 )
40 elif file_format == "json":
41 return read_json(gcs_path, storage_options=get_storage_options(), **kwargs)

File ~/stat-ofi-skatteregnskap/.venv/lib/python3.10/site-packages/pyarrow/array.pxi:687, in pyarrow.lib._PandasConvertible.to_pandas()

TypeError: to_pandas() got an unexpected keyword argument 'skiprows'
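
A possible workaround, sketched under assumptions rather than taken from dapla: since the traceback shows every extra keyword being forwarded to `to_pandas()`, a caller (or a patched `read_pandas`) could pull `skiprows` out of the keyword arguments first and drop the leading rows after the table has been loaded. The helper names `pop_skiprows` and `drop_leading_rows` are hypothetical, and the sketch assumes `skiprows` is a plain non-negative integer meaning "discard the first N rows".

```python
from typing import Any, Dict, Tuple


def pop_skiprows(kwargs: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Split a hypothetical integer ``skiprows`` option from the kwargs to forward."""
    remaining = dict(kwargs)  # copy so the caller's dict is not mutated
    skiprows = remaining.pop("skiprows", 0)
    # Assumption: only a non-negative int is supported in this sketch.
    if isinstance(skiprows, bool) or not isinstance(skiprows, int) or skiprows < 0:
        raise TypeError("this sketch only supports a non-negative integer skiprows")
    return skiprows, remaining


def drop_leading_rows(rows: Any, skiprows: int) -> Any:
    """Drop the first ``skiprows`` rows from a DataFrame or a plain sequence."""
    # DataFrames slice positionally through .iloc; lists and tuples slice directly.
    indexer = getattr(rows, "iloc", rows)
    return indexer[skiprows:]


# Usage: separate skiprows before forwarding the rest, then trim the result.
skiprows, forwarded = pop_skiprows({"skiprows": 5, "use_threads": True})
trimmed = drop_leading_rows(list(range(8)), skiprows)
```

With a real DataFrame, the same idea would read roughly as `drop_leading_rows(dp.read_pandas(path, **forwarded), skiprows)`. Note that this still reads the whole file before discarding rows, so it saves no I/O.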
