UnicodeDecodeError: Can't decode from latin1 #8

pabloab · 2023-12-26T22:19:55Z

I was getting

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 24: invalid continuation byte

0xd1 is "Ñ" in latin1, which is consistent with the output of dbfview -t -b myfile.dbf > myfile.txt; dbfile myfile.txt.
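For reference, a minimal sketch of why this byte triggers the error. The sample value `ESPA\xd1A` is made up for illustration and is not taken from the actual file. In UTF-8, 0xd1 can only start a two-byte sequence, so a following plain ASCII letter is rejected as an invalid continuation byte, while latin1 maps every byte to a character directly.

```python
# Hypothetical field value: "ESPAÑA" encoded as latin1, where "Ñ" is the single byte 0xd1.
raw = b"ESPA\xd1A"

try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # 0xd1 opens a 2-byte UTF-8 sequence, but the next byte ("A") is not a continuation byte.
    error_message = str(exc)
    print(error_message)

# Decoding with the encoding the data was written in works.
decoded = raw.decode("latin1")
print(decoded)
```

This reproduces the same class of error as above, which suggests the input encoding option is not reaching the code that decodes the field.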

I tried

dbf2csv -ie 'latin1' 'myfile.dbf'

which I believe should fix the issue, but it doesn't. This is with dbf2csv installed from source:

git clone https://github.com/akadan47/dbf2csv.git
cd dbf2csv
pip install -r requirements.txt
python setup.py install

dbf2csv --version
# dbf2csv 1.3

Workaround

I ended up converting with LibreOffice

sudo apt install libreoffice-base
libreoffice --headless --convert-to csv myfile.dbf  # Will generate myfile.csv
iconv -f ISO-8859-15 -t UTF-8 myfile.csv > myfile-utf8.csv
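If iconv is not available, the re-encoding step can be sketched in Python. This is only an illustration of the same conversion, assuming the CSV really is ISO-8859-15 as in the command above. The function name `reencode_to_utf8` is made up here and is not part of dbf2csv.

```python
def reencode_to_utf8(src_path, dst_path, source_encoding="iso-8859-15"):
    """Copy a text file, decoding it as `source_encoding` and writing it as UTF-8."""
    # newline="" disables newline translation so line endings are preserved as-is.
    with open(src_path, "r", encoding=source_encoding, newline="") as fin, \
         open(dst_path, "w", encoding="utf-8", newline="") as fout:
        # Read in chunks so large exports do not need to fit in memory.
        for chunk in iter(lambda: fin.read(64 * 1024), ""):
            fout.write(chunk)
```

Calling `reencode_to_utf8("myfile.csv", "myfile-utf8.csv")` would correspond to the iconv command above.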

I validated the CSV generated with frictionless.

For multiple files:

find . -type f -iname '*.dbf' -execdir libreoffice --headless --convert-to csv "{}" +
for FILE in *.csv; do iconv -f ISO-8859-15 -t UTF-8 "$FILE" -o "${FILE%.csv}-utf8.csv"; done
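The batch re-encoding loop can also be sketched in Python, again assuming ISO-8859-15 input. The helper name `convert_all` is hypothetical. Unlike the shell loop, it skips files that already end in `-utf8`, so it can be run more than once in the same directory.

```python
from pathlib import Path


def convert_all(directory, source_encoding="iso-8859-15"):
    """Write a UTF-8 copy of each CSV in `directory` and return the new paths."""
    written = []
    # sorted() materializes the listing first, so files created below are not revisited.
    for path in sorted(Path(directory).glob("*.csv")):
        if path.stem.endswith("-utf8"):
            continue  # copy produced by an earlier run
        target = path.with_name(f"{path.stem}-utf8.csv")
        # Work on bytes so line endings are passed through unchanged.
        text = path.read_bytes().decode(source_encoding)
        target.write_bytes(text.encode("utf-8"))
        written.append(target)
    return written
```

For example, `convert_all(".")` would process the CSV files in the current directory only, like the `for` loop above.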
