Classifium has developed novel techniques that classify tabular data with very high accuracy. For example, the services on this website can be used to determine which ads to show a given potential customer, whether a patient is sick or healthy, or whether a proposed high-frequency trading transaction will be profitable.
Technological Leadership
If you want the best possible results for your data, you should use Classifium. Classifium can be used by anyone, whereas getting the best results from other world-leading machine learning algorithms such as XGBoost requires a highly trained machine learning expert. Classifium is therefore the obvious choice for getting the best classification results with minimum effort. For example, on Covertype, the best-known benchmark for tabular data, Classifium obtains an error rate of 2.41%, whereas XGBoost gives 2.47%, both measured with 5-fold cross-validation. Classifium has a similar accuracy advantage on most datasets.
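For readers unfamiliar with the term, the following is a minimal Python sketch of how a 5-fold cross-validation error rate is computed. It is an illustration only: the function names, the fixed shuffling seed and the majority-class stand-in model are assumptions made for this example and are not Classifium's or XGBoost's algorithm, nor necessarily the fold assignment used for the figures above.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_class_fit(rows, labels):
    """Stand-in model (hypothetical): always predict the most common training label."""
    most_common = Counter(labels).most_common(1)[0][0]
    return lambda row: most_common

def cross_val_error(rows, labels, fit, k=5, seed=0):
    """Fraction of rows misclassified when every row is predicted by a model
    that was trained on the other k-1 folds."""
    errors = 0
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        errors += sum(model(rows[i]) != labels[i] for i in held_out)
    return errors / len(rows)

# Toy dataset: 80 rows of class "a" and 20 rows of class "b".
rows = [[float(i)] for i in range(100)]
labels = ["a"] * 80 + ["b"] * 20
error = cross_val_error(rows, labels, majority_class_fit, k=5)
print(f"5-fold cross-validation error rate: {error:.2%}")
```

Because each row is scored only by a model that never saw it during training, the reported error rate estimates performance on new data rather than on the training set.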
Big Data
Classifium has relatively low memory consumption and can be used for very large datasets. It is implemented in highly optimized C code that uses multithreading at four different granularities, so it scales to very many CPU cores, potentially hundreds, should they be available. Because everything is fully automatic, producing a model for a dataset may require more computing time than a manually tuned approach. However, it is usually preferable to let computers do all the work for you, even if that requires substantial computing resources.
System Integration
Our servers generate ensembles of decision trees from the dataset you provide. Classifium reports the error rate and the confusion matrix obtained from cross-validation. You can then download the trees to your own machine and use them for classification with the C source code provided on this website. That C code is written to be small, clean and easy to use, but not necessarily as efficient as possible. You may modify and use the C source code in any way you choose. Thus, once the trees are generated, you can use them free of charge in your own application indefinitely.
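To make these terms concrete, here is a minimal Python sketch of how an ensemble of decision trees can classify a row by majority vote, and how a confusion matrix and error rate are derived from the predictions. Everything in it is an assumption for illustration: the tuple-based tree format, the function names and the toy data are hypothetical and do not describe the format of the downloadable trees or the provided C code.

```python
from collections import Counter

def tree_predict(node, row):
    """Walk one tree. Internal nodes are (feature, threshold, left, right)
    tuples; anything else is a leaf holding a class label."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if row[feature] <= threshold else right
    return node

def ensemble_predict(trees, row):
    """Majority vote over all trees (a tie goes to the label voted first)."""
    votes = Counter(tree_predict(tree, row) for tree in trees)
    return votes.most_common(1)[0][0]

def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] counts rows of true class i that were predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def error_rate(matrix):
    """Share of rows that fall off the diagonal of the confusion matrix."""
    total = sum(sum(r) for r in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

# Three hypothetical trees over two numeric features.
trees = [
    (0, 5.0, "healthy", "sick"),
    (1, 2.0, "healthy", (0, 7.0, "healthy", "sick")),
    (0, 6.0, "healthy", "sick"),
]
rows = [[4.0, 1.0], [6.5, 3.0], [8.0, 3.0], [5.5, 1.0]]
truth = ["healthy", "sick", "sick", "sick"]

predictions = [ensemble_predict(trees, r) for r in rows]
matrix = confusion_matrix(truth, predictions, ["healthy", "sick"])
print(predictions)
print(matrix, error_rate(matrix))
```

An odd number of trees avoids tied votes in two-class problems such as this one.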
Security
Classifium aims to keep you and your data as confidential as possible under current law and does not store your e-mail address, IP address or name. Instead, you are assigned a numbered, anonymous account. Your dataset is erased from our disks as soon as it has been processed. The Classifium website does not use cookies and is ad-free. Our bare-metal servers are located in Sweden and cannot be logged in to over the internet, which greatly reduces exposure to security flaws such as the side-channel attacks that still plague cloud-based virtual machines.
To analyze your datasets, you need an account number, which is generated automatically after you have chosen your country and, if applicable, entered your VAT number. Save the account number to a file and use it when running analyses. You also need to agree to our Terms of Service.
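Because no e-mail address or name is tied to the account, it is worth storing the account number carefully. The following is one possible way to do so in Python; the function names, the file name and the example number are placeholders chosen for this sketch, and the real account number format may differ.

```python
import os
from pathlib import Path

def save_account_number(account_number, path):
    """Write the account number to a text file readable only by its owner."""
    path = Path(path)
    path.write_text(account_number.strip() + "\n", encoding="utf-8")
    os.chmod(path, 0o600)  # owner read/write only; has limited effect on Windows

def load_account_number(path):
    """Read the account number back, without surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()
```

For example, `save_account_number("12345678", "classifium_account.txt")` stores a placeholder number, and `load_account_number("classifium_account.txt")` returns it when you start an analysis.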
If you are in the EU or the European Economic Area, you need to specify the country from which you will use this website.
When you start a run, you will have at least one 16-core AMD EPYC CPU to yourself. We use bare-metal servers with no virtualization.