HTML/CSV export corrupts UTF-8 characters outside the Basic Multilingual Plane (BMP), i.e. code points U+10000 and above #1197

@Yuutakasan

Description

OpenRefine 2.7 rc2

After importing a UTF-8 file and exporting it again as UTF-8, some characters come out garbled.

Characters as displayed in OpenRefine:
有限会社なべ茶屋あさ𡌛

Characters after export (garbled):
有限会社なべ茶屋あさ����

Other sample characters that are garbled on export:
𣘺𣳾
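
The report does not identify the root cause. As a minimal sketch for reproducing and narrowing it down, the Python helpers below (all function names are hypothetical, not part of OpenRefine) locate characters above U+FFFF in a string and show how each one becomes two UTF-16 code units (a surrogate pair). The last helper simulates one plausible failure mode, an exporter that encodes each UTF-16 code unit on its own; this is an assumption for illustration, not confirmed OpenRefine behavior.

```python
def non_bmp_chars(text):
    """Return (index, char, code point label) for every character above U+FFFF."""
    return [(i, ch, f"U+{ord(ch):04X}")
            for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


def utf16_code_units(ch):
    """Return the UTF-16 code units of a single character as integers."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def encode_per_code_unit(text):
    """Simulate a buggy writer (assumption) that encodes each UTF-16 code unit
    separately. A lone surrogate cannot be encoded as UTF-8, so it is replaced
    with '?' by the 'replace' error handler."""
    out = bytearray()
    for ch in text:
        for unit in utf16_code_units(ch):
            out += chr(unit).encode("utf-8", errors="replace")
    return bytes(out)


if __name__ == "__main__":
    sample = "有限会社なべ茶屋あさ𡌛"  # sample string from this report
    for index, ch, label in non_bmp_chars(sample):
        units = [f"0x{u:04X}" for u in utf16_code_units(ch)]
        print(index, ch, label, units, len(ch.encode("utf-8")), "UTF-8 bytes")
    # A correct export round-trips unchanged.
    print(sample.encode("utf-8").decode("utf-8") == sample)
    # The simulated buggy export loses the non-BMP character.
    print(encode_per_code_unit(sample).decode("utf-8"))
```

Running `non_bmp_chars` over a column's values before and after export is a quick way to confirm which cells are affected.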

Labels

CSV/TSV - About the CSV/TSV import or export
Priority: High - Denotes issues that require urgent attention and may be blocking progress.
Type: Bug - Issues related to software defects or unexpected behavior, which require resolution.
encoding - Selection of encoding at import time, or encoding issues in data cleaning
export - Exporting a project to some format. Use the format-specific sub-label if available
import - About importers in general - add a label for the data format if available
