Akkio FAQ

Frequently Asked Questions about the platform
This section contains common questions we have come across.

Account and Settings

Do my paid actions apply to my team?

Yes. Actions purchased by an account owner are shared with that owner's teams. However, if a user on your team owns a separate team, that team does not use your actions.

Is there a limit to the number of users on my team?

No.

Can I delete my account?

Yes. You can always delete your account and data from the Account page, which is accessed from the cog on the app's home page. Note that trial accounts automatically convert to free accounts, so you will not be charged unless you upgrade.

Building a Model

What’s the best way to deal with class imbalance in Akkio?

Akkio’s main defense against class imbalance is its use of non-parametric models such as XGBoost and Random Forests, which have much less of an issue with class imbalance. You can apply SMOTE or random oversampling with chat data prep, but we do not recommend it, because the result is likely to perform worse in production than the defaults.

When working with text data, is there a way to ensure certain words are excluded from modeling?

No, Akkio doesn’t have the ability to exclude prediction outcomes based on input words or allow users to select exclusion words.

What kind of NLP algorithms does Akkio use?

Akkio’s algorithms look at 256 features of text (e.g., words, order, length, etc.). Akkio focuses on learning the user’s business language.

What is the Akkio Baseline model?

The baseline is Akkio guessing (predicting) the most frequent class in a dataset. The comparison to baseline shows how Akkio’s selected model performs relative to always predicting the most frequent class (e.g., "5.6x better than baseline" means the model was 5.6 times better than the baseline).
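
As an illustration only (this is not Akkio's internal code), a most-frequent-class baseline can be sketched in a few lines of Python. The function name and the toy labels are assumptions made for this example. The exact formula behind the "x times better than baseline" figure is not described here, so the sketch stops at the baseline accuracy itself.

```python
from collections import Counter


def majority_class_baseline(labels):
    """Return the most frequent class and the accuracy of always predicting it."""
    counts = Counter(labels)
    top_class, top_count = counts.most_common(1)[0]
    return top_class, top_count / len(labels)


# Toy dataset (hypothetical): 90 customers stayed, 10 churned.
labels = ["stay"] * 90 + ["churn"] * 10
baseline_class, baseline_accuracy = majority_class_baseline(labels)
print(baseline_class, baseline_accuracy)
```

On an imbalanced dataset like this one, the baseline already looks accurate, which is why a model is worth comparing against it rather than against zero.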

Does Akkio use Bayesian models?

We don’t use any Bayes or Naive Bayes models. We use Neural Networks, Random Forests, Linear and Logistic Regression, among other models.

Does Akkio use SMOTE or class weight for dealing with imbalanced classification?

No, Akkio uses model architectures that remove the need for it.

Can I see what models were tested and how they performed during training?

We do not currently support that, but it is a roadmap feature.

When running predictions, not every field used to train will always be filled in; how does Akkio handle these empty fields?

We treat them as null fields; if there are matching null fields in the training set, we look for patterns from there.

Chat Explore

How should I handle missing data during EDA/Data Prep?

Akkio is robust to missing data and will tell a user how accurate the model is with missing data.
Users can improve their model’s performance by providing more data or doing data cleaning/imputations with Chat Explore.

Is Chat Explore case-sensitive?

No, Chat Explore is not case-sensitive. The tool is still evolving, though, and there will be limitations on its understanding.

Does Chat Explore work on merged data?

As of now, no. After merging the data in Akkio, the best approach is to download the merged dataset and reupload it; Chat Explore will then work on the reuploaded dataset.
In the future, merge will be part of data prep, and then Chat Explore will work on merged datasets.

If I clean data from my integrated source, can I feed it back into that source?

Yes, you can deploy a data prep workflow back into the integration it came from.

Can I white-label the shareable content generated from Chat Explore?

Yes, all shareable content can be white-labeled on plans that allow white-labeling.

Deploying a Model

How does Akkio address multicollinearity in the data?

Akkio doesn’t remove multicollinearity beforehand but addresses it in the modeling step by trying various models that are sensitive or insensitive to multicollinearity.

How does Akkio avoid overfitting on a model during training?

Akkio uses k-fold cross-validation to avoid model overfitting.
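
For readers unfamiliar with the technique, here is a minimal sketch of how k-fold cross-validation splits a dataset. It is a generic illustration, not Akkio's implementation: the function name, the number of folds, and the use of contiguous (unshuffled) folds are all assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Each row is held out for validation exactly once, so a model that only memorizes its training rows scores poorly on the held-out folds. In practice the rows are usually shuffled before splitting.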

Does Akkio provide Feature Correlations?

This is an upcoming feature but is not currently supported.

Does Akkio automatically scale data (e.g., log) in the modeling phase?

Depending on the data distribution, Akkio might apply a log transform.
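
The rule Akkio uses to decide when to apply a log transform is not described here. Purely as an illustration, one common heuristic is to transform a non-negative column when it is strongly right-skewed. In the sketch below, the skewness test, the threshold of 1.0, and the function names are all assumptions, not Akkio's behavior.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def maybe_log_transform(values, skew_threshold=1.0):
    """Apply log(1 + x) if the column is non-negative and strongly right-skewed."""
    if min(values) >= 0 and skewness(values) > skew_threshold:
        return [math.log1p(v) for v in values], True
    return list(values), False


# Hypothetical column with two very large values.
incomes = [20, 22, 25, 27, 30, 31, 35, 40, 400, 2500]
transformed, applied = maybe_log_transform(incomes)
print(applied, [round(v, 2) for v in transformed])
```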

Does Akkio show which Time Series algorithm was selected for time series modeling?

At present, we don't show this. However, it is on our roadmap.

Can a model be tweaked by adding a regressor? Can the user configure the model?

Our Engineering team is discussing this. Our current understanding is that it would not be too hard to implement.

How does Akkio determine Top Fields?

Top Fields are determined by how much the predicted value changes as the values in a field (column) change. This is similar to Permutation Importance.
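
For reference, standard permutation importance can be sketched as follows. This shows the general technique that Top Fields is compared to, not Akkio's exact calculation. The toy model, the column names, and the function names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, column, seed=0):
    """Drop in accuracy after shuffling one column's values across the rows."""
    rng = random.Random(seed)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [{**r, column: v} for r, v in zip(rows, shuffled)]
    return accuracy(model, rows, labels) - accuracy(model, permuted, labels)


# Hypothetical model: flags churn from the ticket count and ignores color.
def toy_model(row):
    return row["tickets"] > 3


data = [(0, "red"), (1, "blue"), (5, "red"), (7, "blue"),
        (2, "red"), (9, "blue"), (4, "red"), (3, "blue")]
rows = [{"tickets": t, "color": c} for t, c in data]
labels = [toy_model(r) for r in rows]

tickets_importance = permutation_importance(toy_model, rows, labels, "tickets")
color_importance = permutation_importance(toy_model, rows, labels, "color")
print(tickets_importance, color_importance)
```

A column the model never uses, such as "color" here, has an importance of zero because shuffling it cannot change any prediction.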

Setting up Integrations

Are there any limitations on using an integration with a free trial?

No. If a user is having trouble connecting with one of the pre-built integrations, they might not have been given sufficient permissions, or there may be an authorization error.

When using an integration, is data moved from the integrated system into Akkio?

Yes, data is moved into Akkio and stored natively.

I have a database that doesn't integrate with Akkio; what are my options?

The API can connect to other systems; we are also always working to expand our native integrations, and we encourage you to reach out to support with requests for new integrations.

How much data can Akkio handle for Excel files?

Akkio is designed to handle large amounts of data: you can upload an Excel file of the maximum size (about 1 million rows). If you have more data, use a CSV or one of our integrations instead.

Can I use a demo Snowflake account with Akkio?

No. As noted above, you can use integrations with a free Akkio trial, but the Snowflake free demo does not work with Akkio.

Data Security and Compliance

Does Akkio encrypt data at rest and in flight?

Yes, Akkio encrypts data at rest and in flight.

Does Akkio support having a secure VPN tunnel between Akkio and a data source?

No. However, we are SOC 2 Type II compliant.

What data does Akkio share with OpenAI (GPT)?

Akkio does not send data to OpenAI. We use GPT via a private Azure deployment.

Is my data retained or used independently of my instance?

No data is retained or used to create training sets, update the platform, etc.

Does Akkio support having a TLS tunnel between a data store and an AWS instance of Akkio?

Yes, we do, and we inherit Amazon or Google security.

Is Akkio GDPR Compliant?

This is coming very soon; if this is a requirement for your business, please get in touch with support.

About Akkio modeling and predictions

What types of statistical techniques are these models using to make predictions?

We use several modeling methods, including Neural Networks, Random Forests, and Decision Trees. In brief:
Neural networks model complex input-target relationships using linear and non-linear transformations optimized by gradient descent.
Random forests use bagging and feature randomness to combine the outputs of multiple decision trees for higher accuracy and reduced overfitting.
Decision trees recursively split input data based on feature values, aiming for homogeneous target variable subsets determined by techniques like entropy, Gini impurity, or information gain.
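
To make the decision tree splitting criteria concrete, the following sketch computes Gini impurity, entropy, and information gain for a candidate split. These are the textbook definitions. The function names and the toy labels are assumptions made for the example.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0 means a pure subset."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
print(gini_impurity(parent))                                     # 0.5
print(entropy(parent))                                           # 1.0
print(information_gain(parent, ["yes", "yes"], ["no", "no"]))    # 1.0
```

A split that separates the classes perfectly, as in the last line, recovers all of the parent's entropy; a tree prefers splits with the highest gain (or the lowest resulting impurity).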

How do we know what assumptions the model makes, and whether its results are generalizable or statistically significant?

Different algorithms make varying assumptions about the data distribution. Non-parametric models like decision trees and random forests make fewer assumptions, while neural networks assume differentiability in input-target relationships. Though statistical significance isn't directly evaluated, performance metrics like accuracy, precision, recall, and F1 score can be used to assess a model's effectiveness.

How do we handle multicollinearity, singularity, and outliers?

Multicollinearity is addressed within the platform to help remove redundant features and improve model performance.
Singularity, often caused by a high degree of correlation between features or perfect collinearity, can be resolved by removing one of the collinear features.
We are generally robust to outliers, but if necessary, they can be removed with chat data prep or the soon-to-be-launched data cleaning tool. Some models, like decision trees and random forests, are less sensitive to outliers than others.
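
If you want to check your own data for collinear features before uploading, one simple approach is to flag pairs of columns whose Pearson correlation is close to 1 or -1. The sketch below is an illustration under stated assumptions: the 0.99 threshold, the function names, and the sample columns are hypothetical, and this is not how the platform itself detects collinearity.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return 0.0 if sd_x == 0 or sd_y == 0 else cov / (sd_x * sd_y)


def collinear_pairs(columns, threshold=0.99):
    """Return pairs of column names whose absolute correlation meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]


# Hypothetical columns: price_cents is just price_usd times 100.
columns = {
    "price_usd": [10.0, 20.0, 30.0, 40.0],
    "price_cents": [1000.0, 2000.0, 3000.0, 4000.0],
    "units": [3.0, 1.0, 4.0, 1.0],
}
flagged = collinear_pairs(columns)
print(flagged)
```

For a flagged pair, keeping only one of the two columns removes the redundancy.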

Are the modeling processes transparent?

We provide insight into the driving factors for all models as part of the model creation process. We call this the insights report, which makes the model decision-making more transparent.
Our platform aims to provide ML capabilities without the need for code, so the transparency comes in these reports in digestible form. More detail can be found by drilling into the advanced sections of the report.

API

What response time and request volume can the API handle?

The API handles five requests per second. However, each request can be a bulk call, so actual throughput is more case-by-case. Please feel free to contact support for your specific use case.
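
To stay within a limit of five requests per second on the client side, you can space out your calls. The sketch below is a generic throttle, not part of the Akkio API: `RateLimiter`, `send_throttled`, and the `send` callable (whatever function you use to make one request) are hypothetical names introduced for this example.

```python
import time


class RateLimiter:
    """Space calls so that at most max_per_second of them start in any second."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def send_throttled(payloads, send, limiter):
    """Call send(payload) for each payload, pausing as needed between calls."""
    results = []
    for payload in payloads:
        limiter.wait()
        results.append(send(payload))
    return results


if __name__ == "__main__":
    # Stand-in for a real request function; here it just echoes the payload.
    limiter = RateLimiter(max_per_second=5)
    print(send_throttled(["a", "b", "c"], lambda p: p.upper(), limiter))
```

Because each request can carry a bulk payload, grouping rows into fewer, larger calls is usually a better way to raise throughput than sending requests faster.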