Commit d5aee30

2 parents 0ded7df + 976b4db

3 files changed (+16 −23 lines)

docs/getting_started/api.md

Lines changed: 7 additions & 15 deletions
@@ -40,9 +40,7 @@ The cost for each API request is calculated as:
 api_cost = (num_train_rows + num_test_rows) * num_cols * n_estimators
 ```
 
-Where `n_estimators` is by default:
-- 4 for classification tasks
-- 8 for regression tasks
+Where `n_estimators` is by default 4 for classification tasks and 8 for regression tasks.
 
 ### Monitoring Usage
 
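The cost formula in the hunk above can be sketched as a small helper. `estimate_api_cost` is a hypothetical name, not part of `tabpfn-client`; only the formula and the default `n_estimators` values (4 for classification, 8 for regression) come from the docs.

```python
# Illustrative sketch of the documented API cost formula; the helper
# itself is hypothetical and not part of the tabpfn-client package.
def estimate_api_cost(num_train_rows: int, num_test_rows: int,
                      num_cols: int, task: str = "classification") -> int:
    # Documented defaults: 4 estimators for classification, 8 for regression.
    n_estimators = 4 if task == "classification" else 8
    return (num_train_rows + num_test_rows) * num_cols * n_estimators

print(estimate_api_cost(1000, 200, 10))                      # → 48000
print(estimate_api_cost(1000, 200, 10, task="regression"))   # → 96000
```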
@@ -56,8 +54,6 @@ Track your API usage through response headers:
 
 ## Current Limitations
 
-### Data Privacy and Security
-
 !!! warning "Important Data Guidelines"
 - Do NOT upload any Personally Identifiable Information (PII)
 - Do NOT upload any sensitive or confidential data
@@ -68,17 +64,13 @@ Track your API usage through response headers:
 ### Size Limitations
 
 1. Maximum total cells per request must be below 100,000:
-```python
-max_cells = (num_train_rows + num_test_rows) * num_cols
 ```
-
-2. For regression with full output (`return_full_output=True`), the number of test samples must be below 500:
-```python
-if task == 'regression' and return_full_output and num_test_samples > 500:
-    raise ValueError("Cannot return full output for regression with >500 test samples")
+(num_train_rows + num_test_rows) * num_cols < 100,000
 ```
 
-These limits will be increased in future releases.
+2. For regression with full output turned on (`return_full_output=True`), the number of test samples must be below 500.
+
+These limits will be relaxed in future releases.
 
 ### Managing User Data
 
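The two size limits in the hunk above can be combined into a pre-flight check. `check_request_limits` is a hypothetical helper, not part of `tabpfn-client`; the `> 500` comparison follows the check in the deleted snippet.

```python
# Illustrative pre-flight validation of the documented size limits;
# the function and argument names are hypothetical.
def check_request_limits(num_train_rows: int, num_test_rows: int, num_cols: int,
                         task: str = "classification",
                         return_full_output: bool = False) -> None:
    # Limit 1: total cells must stay below 100,000.
    total_cells = (num_train_rows + num_test_rows) * num_cols
    if total_cells >= 100_000:
        raise ValueError(f"Total cells ({total_cells}) must be below 100,000")
    # Limit 2: full regression output is capped at 500 test samples.
    if task == "regression" and return_full_output and num_test_rows > 500:
        raise ValueError("Cannot return full output for regression with >500 test samples")

check_request_limits(5000, 1000, 10)  # 60,000 cells: within limits, no error
```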
@@ -99,11 +91,11 @@ The API uses standard HTTP status codes:
 | 400 | Invalid request |
 | 429 | Rate limit exceeded |
 
-Example error response:
+Example response when the limit is reached:
 ```json
 {
   "error": "API_LIMIT_REACHED",
   "message": "Usage limit exceeded",
   "next_available_at": "2024-01-07 00:00:00"
 }
-```
+```
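A client could react to the documented error payload roughly as follows. The field names (`error`, `message`, `next_available_at`) come from the example above; the handling code itself is an illustrative sketch.

```python
import json

# The payload below reproduces the documented example response.
body = '''{
  "error": "API_LIMIT_REACHED",
  "message": "Usage limit exceeded",
  "next_available_at": "2024-01-07 00:00:00"
}'''

payload = json.loads(body)
if payload.get("error") == "API_LIMIT_REACHED":
    # next_available_at tells the client when to retry.
    message = f"Rate limited; retry after {payload['next_available_at']}"
print(message)
```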

docs/getting_started/install.md

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-You can access our models through our API (https://github.com/automl/tabpfn-client) or via our user interface built on top of the API (https://www.ux.priorlabs.ai/).
+You can access our models through our API (https://github.com/automl/tabpfn-client), via our user interface built on top of the API (https://www.ux.priorlabs.ai/), or locally.
 
 === "Python API Client (No GPU, Online)"
 
@@ -28,4 +28,4 @@ You can access our models through our API (https://github.com/automl/tabpfn-clie
 !!! warning
 R support is currently under development.
 You can find a work in progress at [TabPFN R](https://github.com/robintibor/R-tabpfn).
-Looking for contributors!
+Looking for contributors!

docs/getting_started/intended_use.md

Lines changed: 7 additions & 6 deletions
@@ -3,15 +3,17 @@
 !!! note
 For a simple example getting started with classification see [classification tutorial](../tutorials/classification.md).
 
-We provide a comprehensive demo notebook that guides through installation and functionalities at [Interactive Colab Tutorial (with GPU usage)](https://tinyurl.com/tabpfn-colab-local) and [Interactive Colab Tutorial (without GPU usage)](https://tinyurl.com/tabpfn-colab-online).
+We provide two comprehensive demo notebooks that guide you through installation and functionality: one [colab tutorial using the cloud](https://tinyurl.com/tabpfn-colab-online) and one [colab tutorial using the local GPU](https://tinyurl.com/tabpfn-colab-local).
 
 ### When to use TabPFN
 
-TabPFN excels in handling small to medium-sized datasets with up to 10,000 samples and 500 features. For larger datasets, approaches such as CatBoost, XGB, or AutoGluon are likely to outperform TabPFN.
+TabPFN excels in handling small to medium-sized datasets with up to 10,000 samples and 500 features. For larger datasets, methods such as CatBoost, XGBoost, or AutoGluon are likely to outperform TabPFN.
 
 ### Intended Use of TabPFN
 
-While TabPFN provides a powerful drop-in replacement for traditional tabular data models, achieving top performance on real-world problems often requires domain expertise and the ingenuity of data scientists. Data scientists should continue to apply their skills in feature engineering, data cleaning, and problem framing to get the most out of TabPFN.
+TabPFN is intended as a powerful drop-in replacement for traditional tabular data prediction tools where top performance and fast training matter.
+It still requires data scientists to prepare the data using their domain knowledge.
+Data scientists will see benefits from feature engineering, data cleaning, and problem framing to get the most out of TabPFN.
 
 ### Limitations of TabPFN
 
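The dataset-size guidance in the hunk above can be expressed as a simple check. `tabpfn_suitable` is a hypothetical helper name; the 10,000-sample and 500-feature thresholds are the ones stated in the docs.

```python
# Illustrative suitability check based on the documented guidance:
# TabPFN targets datasets up to 10,000 samples and 500 features.
def tabpfn_suitable(n_samples: int, n_features: int) -> bool:
    return n_samples <= 10_000 and n_features <= 500

print(tabpfn_suitable(8_000, 100))   # → True
print(tabpfn_suitable(50_000, 100))  # → False: consider CatBoost, XGBoost, or AutoGluon
```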
@@ -21,7 +23,7 @@ While TabPFN provides a powerful drop-in replacement for traditional tabular dat
 
 ### Computational and Time Requirements
 
-TabPFN is computationally efficient and can run on consumer hardware for most datasets. Training on a new dataset is recommended to run on a GPU as this speeds it up significantly. However, TabPFN is not optimized for real-time inference tasks.
+TabPFN is computationally efficient and can run inference on consumer hardware for most datasets. Training on a new dataset is best run on a GPU, as this speeds it up significantly. TabPFN is not optimized for real-time inference tasks, but V2 predicts much faster than V1.
 
 ### Data Preparation
 
@@ -33,5 +35,4 @@ TabPFN's predictions come with uncertainty estimates, allowing you to assess the
 
 ### Hyperparameter Tuning
 
-TabPFN provides strong performance out-of-the-box without extensive hyperparameter tuning. If you have additional computational resources, you can further optimize TabPFN's performance using random hyperparameter tuning or the Post-Hoc Ensembling (PHE) technique.
-
+TabPFN provides strong performance out-of-the-box without extensive hyperparameter tuning. If you have additional computational resources, you can automatically tune its hyperparameters using [post-hoc ensembling](https://github.com/PriorLabs/tabpfn-extensions/tree/main/src/tabpfn_extensions/post_hoc_ensembles) or [random tuning](https://github.com/PriorLabs/tabpfn-extensions/tree/main/src/tabpfn_extensions/hpo).
