Skip to content

pipe jBin to sed to remove trailing empty columns#37

Closed
davidworkman9 wants to merge 4 commits intodbashford:masterfrom
davidworkman9:master
Closed

pipe jBin to sed to remove trailing empty columns#37
davidworkman9 wants to merge 4 commits intodbashford:masterfrom
davidworkman9:master

Conversation

@davidworkman9
Copy link
Contributor

I had a problem where XLS and XLSX files had reference to a column number way off in the distance with nothing in it (probably from someone applying formatting an entire row). I started hitting max buffer limits, and I couldn't even find a number that would handle it without hitting it! So instead I pipe the output of jBin to sed and remove all the trailing empty columns in a row.

I also opened an issue with J here, and it sounds like there might eventually be a switch to have it to this itself, or even be the default behaviour, so this could be removed when that becomes available.

@davidworkman9
Copy link
Contributor Author

Bump.

@dbashford
Copy link
Owner

Not familiar enough with windows to know if sed is installed by default?

I imagine it isn't. I struggle with this because of the added external dependency. Can you provide an example problematic xls for me to play around with?

@davidworkman9
Copy link
Contributor Author

I can understand not wanting to add an external dependency but I think there are other things in this library that aren't windows compatible. I'll work on getting a document that demonstrates this issue. Would you merge a request that didn't use sed unless it was not running on windows?

@dbashford
Copy link
Owner

I've been trying to reduce the number of external dependencies, not grow them. I had more, like unzip, but systematically pushed through nuking those. Some, the ones that just outright do the extraction, are hard to avoid.

Ideally the solution is one performed in node. I wonder if the exec was turned into a spawn that the stream could be worked with and avoid buffer issues?

@davidworkman9
Copy link
Contributor Author

@davidworkman9
Copy link
Contributor Author

I've been trying to reduce the number of external dependencies, not grow them.

Makes sense.

I wonder if the exec was turned into a spawn that the stream could be worked with and avoid buffer issues?

That's probably a workable solution. I would still see value in getting rid of the trailing empty columns, but at least that could be done in JS then. I'll see if I can implement that.

@dbashford
Copy link
Owner

I would still see value in getting rid of the trailing empty columns, but at least that could be done in JS then.

Yep! That's the idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants