perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2 #697
lars-reimann merged 5 commits into main
🦙 MegaLinter status: ✅ SUCCESS
@lars-reimann do you really want to save the exact same data twice in a …
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #697   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           66        66
  Lines         4873      4873
=========================================
  Hits          4873      4873
Due to … We should add some performance benchmarks, though, to check the runtime and memory use of the current implementation (not needed for this PR).
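A rough idea of what such a benchmark could look like, as a minimal sketch using only the standard library. The helper names (`time_eq`, `peak_memory`, `make_toy`) are hypothetical, and the toy data is a plain dictionary rather than a real `TabularDataset`; a real benchmark would pass a factory that builds the actual dataset.

```python
import timeit
import tracemalloc
from typing import Callable


def time_eq(make: Callable[[], object], repeats: int = 5, number: int = 20) -> float:
    """Return the best observed time (in seconds) for one `a == b` comparison."""
    a, b = make(), make()  # two equal but distinct objects
    timings = timeit.repeat(lambda: a == b, repeat=repeats, number=number)
    return min(timings) / number


def peak_memory(make: Callable[[], object]) -> int:
    """Return the peak number of bytes traced while building one object."""
    tracemalloc.start()
    try:
        obj = make()  # keep a reference so the object is alive when we measure
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def make_toy(n_rows: int = 10_000) -> dict[str, list[int]]:
    """Hypothetical stand-in for constructing a dataset."""
    return {"feature": list(range(n_rows)), "target": [0] * n_rows}


if __name__ == "__main__":
    print(f"eq:   {time_eq(make_toy):.6f} s per comparison")
    print(f"peak: {peak_memory(make_toy)} bytes")
```

Taking the minimum over several repeats reduces noise from other processes. The numbers are only comparable between runs on the same machine.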
Regarding column order: Should that matter for equality? If so, we should compare the tables and the names of target, features, and extras. |
In the …
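The two options in the column-order question can be sketched as follows. This is an illustration only: `ToyTabularDataset`, its attributes, and the `column_order_matters` flag are assumptions made for the example, not the Safe-DS API.

```python
class ToyTabularDataset:
    """Hypothetical stand-in; the table is a dict of column name -> values."""

    def __init__(self, table, target_name, feature_names, extra_names):
        self.table = dict(table)  # insertion order models column order
        self.target_name = target_name
        self.feature_names = list(feature_names)
        self.extra_names = list(extra_names)

    def equals(self, other, *, column_order_matters):
        if self.target_name != other.target_name:
            return False
        if column_order_matters:
            # Compare the tables and the names of features and extras, in order.
            return (
                list(self.table) == list(other.table)
                and self.table == other.table
                and self.feature_names == other.feature_names
                and self.extra_names == other.extra_names
            )
        # Order-insensitive: dict equality ignores insertion order.
        return (
            self.table == other.table
            and set(self.feature_names) == set(other.feature_names)
            and set(self.extra_names) == set(other.extra_names)
        )


a = ToyTabularDataset({"f1": [1, 2], "f2": [3, 4], "y": [0, 1]}, "y", ["f1", "f2"], [])
b = ToyTabularDataset({"f2": [3, 4], "f1": [1, 2], "y": [0, 1]}, "y", ["f2", "f1"], [])

print(a.equals(b, column_order_matters=False))  # True
print(a.equals(b, column_order_matters=True))   # False
```

If order matters, the name lists have to be compared as sequences in addition to the table contents. If it does not, comparing them as sets is enough.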
…up to 2 (#697)

### Summary of Changes

perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2
perf: slightly improved performance of `TabularDataset.__hash__`
fix: corrected `TabularDataset.__sizeof__`

---------

Co-authored-by: megalinter-bot <129584137+megalinter-bot@users.noreply.github.com>
## [0.24.0](v0.23.0...v0.24.0) (2024-05-09)

### Features

* `Column.plot_histogram()` using `Table.plot_histograms` for consistent results ([#726](#726)) ([576492c](576492c))
* `Regressor.summarize_metrics` and `Classifier.summarize_metrics` ([#729](#729)) ([1cc14b1](1cc14b1)), closes [#713](#713)
* `Table.keep_only_rows` ([#721](#721)) ([923a6c2](923a6c2))
* `Table.remove_rows` ([#720](#720)) ([a1cdaef](a1cdaef)), closes [#698](#698)
* Add `ImageDataset` and Layer for ConvolutionalNeuralNetworks ([#645](#645)) ([5b6d219](5b6d219)), closes [#579](#579) [#580](#580) [#581](#581)
* added load_percentage parameter to ImageList.from_files to load a subset of the given files ([#739](#739)) ([0564b52](0564b52)), closes [#736](#736)
* added rnn layer and TimeSeries conversion ([#615](#615)) ([6cad203](6cad203)), closes [#614](#614) [#648](#648) [#656](#656) [#601](#601)
* Basic implementation of cell with polars ([#734](#734)) ([004630b](004630b)), closes [#712](#712)
* deprecate `Table.add_column` and `Table.add_row` ([#723](#723)) ([5dd9d02](5dd9d02)), closes [#722](#722)
* deprecated `Table.from_excel_file` and `Table.to_excel_file` ([#728](#728)) ([c89e0bf](c89e0bf)), closes [#727](#727)
* Larger histogram plot if table only has one column ([#716](#716)) ([31ffd12](31ffd12))
* polars implementation of a column ([#738](#738)) ([732aa48](732aa48)), closes [#712](#712)
* polars implementation of a row ([#733](#733)) ([ff627f6](ff627f6)), closes [#712](#712)
* polars implementation of table ([#744](#744)) ([fc49895](fc49895)), closes [#638](#638) [#641](#641) [#649](#649) [#712](#712)
* regularization for decision trees and random forests ([#730](#730)) ([102de2d](102de2d)), closes [#700](#700)
* Remove device information in image class ([#735](#735)) ([d783caa](d783caa)), closes [#524](#524)
* return fitted transformer and transformed table from `fit_and_transform` ([#724](#724)) ([2960d35](2960d35)), closes [#613](#613)

### Bug Fixes

* make `Image.clone` internal ([#725](#725)) ([215a472](215a472)), closes [#626](#626)

### Performance Improvements

* improved performance of `TabularDataset.__eq__` by a factor of up to 2 ([#697](#697)) ([cd7f55b](cd7f55b))
🎉 This PR is included in version 0.24.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
### Summary of Changes

* perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2
* perf: slightly improved performance of `TabularDataset.__hash__`
* fix: corrected `TabularDataset.__sizeof__`
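The conversation does not show how the three methods were changed, so the following is only a generic sketch of how `__eq__`, `__hash__`, and `__sizeof__` can be kept consistent on a container class. `ToyDataset` and its `_columns` attribute are hypothetical, and the shortcuts shown (identity check, hashing only names and lengths) are common techniques rather than a description of this PR.

```python
import sys


class ToyDataset:
    """Hypothetical container; assumes its data is not mutated after creation."""

    def __init__(self, columns: dict[str, list[int]]) -> None:
        self._columns = columns

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ToyDataset):
            return NotImplemented
        if self is other:
            return True  # identity shortcut: skip the element-wise comparison
        return self._columns == other._columns

    def __hash__(self) -> int:
        # Cheap hash over column names and lengths only. Equal datasets
        # always hash equally; unequal ones may collide, which is allowed.
        return hash(tuple((name, len(self._columns[name])) for name in sorted(self._columns)))

    def __sizeof__(self) -> int:
        # Count the object itself plus the containers it owns (shallow:
        # the individual values inside each list are not included).
        return (
            object.__sizeof__(self)
            + sys.getsizeof(self._columns)
            + sum(sys.getsizeof(column) for column in self._columns.values())
        )


a = ToyDataset({"x": [1, 2, 3], "y": [0, 1, 0]})
b = ToyDataset({"y": [0, 1, 0], "x": [1, 2, 3]})
c = ToyDataset({"x": [1, 2, 4], "y": [0, 1, 0]})

print(a == b, hash(a) == hash(b))  # True True
print(a == c)                      # False
```

The hash sorts column names because dictionary equality ignores insertion order. A hash that depended on order would break the rule that equal objects must have equal hashes.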