Akkio FAQ

Frequently Asked Questions about the platform
This section contains common questions we have come across.

Account and Settings

Do my paid actions apply to my team?

Yes. Actions purchased by an account owner are used by the teams that owner owns. However, if a user on your team also owns a separate team, that team does not use your actions.

Is there a limit to the number of users on my team?

No.

Can I delete my account?

Yes. You can always delete your account and data from the Account page, which is accessed from the cog on the home page of the app. Note that trial accounts automatically convert to free accounts and will not charge you unless you choose to upgrade.

Building a Model

What’s the best way to deal with extreme class imbalance in Akkio?

We suggest reducing the majority class down to 10% or 2%, so Akkio can learn what causes the model to predict the minority class.
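
One way to do this before uploading a dataset is to randomly downsample the majority class. The following is a minimal sketch, not an Akkio feature: it assumes rows are dictionaries with a hypothetical `label` key, and it reads "10%" as keeping 10% of the majority rows, which the answer above does not spell out. `downsample_majority` is an illustrative helper name.

```python
import random
from collections import Counter

def downsample_majority(rows, label_key="label", keep_fraction=0.10, seed=0):
    """Keep every minority-class row and a random fraction of majority-class rows."""
    counts = Counter(row[label_key] for row in rows)
    majority_label, _ = counts.most_common(1)[0]
    majority = [r for r in rows if r[label_key] == majority_label]
    others = [r for r in rows if r[label_key] != majority_label]
    # Assumption: keep_fraction is relative to the original majority size.
    keep_n = max(1, round(len(majority) * keep_fraction))
    rng = random.Random(seed)
    result = others + rng.sample(majority, keep_n)
    rng.shuffle(result)
    return result

# Toy dataset: 980 "no" rows and 20 "yes" rows.
rows = [{"label": "no"}] * 980 + [{"label": "yes"}] * 20
balanced = downsample_majority(rows, keep_fraction=0.10)
balanced_counts = Counter(r["label"] for r in balanced)
```

Changing `keep_fraction` to 0.02 gives the more aggressive reduction mentioned above.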

When working with text data, is there a way to make sure certain words are excluded from modeling?

No, Akkio doesn’t have the ability to exclude prediction outcomes based on input words or to let users select exclusion words.

What kind of NLP algorithms does Akkio use?

Akkio’s algorithms look at 256 features of text (e.g. words, order, length, etc.). Akkio focuses on learning the user’s business language.

What is the Akkio Baseline model?

The baseline is Akkio guessing (predicting) the most frequent class in a dataset. The comparison to baseline shows how Akkio’s selected model performs compared to always predicting the most frequent class (e.g. 5.6x better than baseline means the model was 5.6 times better than baseline).
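
As an illustration of what a majority-class baseline looks like, here is a minimal sketch that computes the accuracy of always predicting the most frequent class. The function name and toy labels are hypothetical, and the exact formula Akkio uses for the "x times better" figure is not stated here, so it is not reproduced.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)

# Toy labels: 90 "stay" rows and 10 "churn" rows.
labels = ["stay"] * 90 + ["churn"] * 10
baseline = majority_baseline_accuracy(labels)
```

On this toy data the baseline is right 90% of the time without learning anything, which is why a model's raw accuracy is more informative when read against it.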

Does Akkio use Bayesian models?

We don’t use any Bayes or Naive Bayes models. We use Neural Networks, Random Forests, and Linear and Logistic Regression, among other models.

Does Akkio use SMOTE or class weight for dealing with imbalanced classification?

No, Akkio uses model architectures that remove the need for it.

Can I see what models were tested and how they performed during training?

We do not currently support that but it is a roadmap feature.

When running predictions, not every field that was used in training will always be filled in. How does Akkio handle these empty fields?

We treat them as null fields, and if there were matching null fields in the training set, we look for patterns from there.

Chat Explore

How should I handle missing data during EDA/Data Prep?

Akkio is robust to missing data and will tell a user how accurate the model is with missing data.
A user can improve their model’s performance by providing more data or by doing data cleaning and imputation with Chat Explore.

Is Chat Explore case sensitive?

No. The tool is still evolving and its understanding has limitations, but it is not case sensitive.

Does Chat Explore work on merged data?

As of now, no. The best workaround is to merge the data in Akkio, download the merged dataset, and reupload it; Chat Explore can then be used on the reuploaded dataset.
In the future, merge will be part of data prep, and then Chat Explore will work on merged datasets.

If I clean data from my integrated source can I feed it back into that source?

Yes, you can deploy a data prep workflow back into the integration it came from.

Can I white-label the shareable content generated from Chat Explore?

Yes, all shareable content can be white-labeled on plans that allow white-labeling.

Deploying a Model

How does Akkio address multicollinearity in the data?

Akkio doesn’t remove multicollinearity beforehand but addresses it in the modeling step by trying a variety of models which are variously sensitive or insensitive to multicollinearity.

How does Akkio avoid overfitting on a model during training?

Akkio uses k-fold cross validation to avoid model overfitting.
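
For readers unfamiliar with the technique, the splitting step of k-fold cross validation can be sketched as follows. This is a generic illustration, not Akkio's implementation: the value of `k`, the shuffling, and the helper name `k_fold_indices` are all assumptions.

```python
import random

def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k roughly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

splits = list(k_fold_indices(10, k=5))
```

Each row is used for validation exactly once and for training in the other folds, so the averaged validation score reflects performance on data the model did not see while fitting.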

Is Feature Correlations something Akkio does?

This is an upcoming feature but is not currently supported.

Does Akkio do any data scaling (e.g. log) automatically in the modeling phase?

Depending on the data distribution, Akkio might apply a log transform.

For time series modeling, does Akkio show which Time Series algorithm was selected?

At present, we don't show this; however, it is on our roadmap.

Can a model be tweaked by adding a regressor? Can the model be configured by the user?

This is something our Engineering team is discussing. Our understanding is that Engineering doesn't think this would be too hard to implement.

How does Akkio determine Top Fields?

Top Fields are determined by how much the predicted value changes as the field (column) changes. This is similar to Permutation Importance.
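
For reference, standard permutation importance can be sketched as below: shuffle one column and measure how much the model's error grows. This shows the general technique only. Akkio's Top Fields calculation is described as similar, not identical, and the toy model, column names, and helper names are hypothetical.

```python
import random

def mean_abs_error(predict, rows, targets):
    return sum(abs(predict(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, column, seed=0):
    """Increase in error after shuffling one column's values across rows."""
    base_error = mean_abs_error(predict, rows, targets)
    shuffled_values = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled_values)
    permuted = [dict(r, **{column: v}) for r, v in zip(rows, shuffled_values)]
    return mean_abs_error(predict, permuted, targets) - base_error

# Toy model: the prediction depends on "spend" only, never on "noise".
def predict(row):
    return 2.0 * row["spend"]

rows = [{"spend": float(i), "noise": float((i * 7) % 5)} for i in range(20)]
targets = [predict(r) for r in rows]

spend_importance = permutation_importance(predict, rows, targets, "spend")
noise_importance = permutation_importance(predict, rows, targets, "noise")
```

Shuffling `noise` leaves every prediction unchanged, so its importance is zero, while shuffling `spend` degrades the predictions and yields a positive score.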

Setting up Integrations

Are there any limitations on using an integration with a free trial?

No. If a user is having trouble connecting with one of the pre-built integrations, they might not have granted sufficient permissions, or there might be an authorization error.

When using an integration is data moved from the integrated system into Akkio?

Yes, data is moved into Akkio and stored natively.

I have a database that doesn't have an integration with Akkio, what are my options?

The API can be used to connect to other systems. We are also always working to expand our native integrations, and we encourage you to reach out to support with requests for new integrations.

How much data can Akkio handle for Excel files?

10 million rows.

Can I use a demo Snowflake account with Akkio?

No. As noted above, you can use a free Akkio trial with integrations, but the Snowflake free demo does not function with Akkio.

Data Security and Compliance

Does Akkio encrypt data at rest and in flight?

Yes, Akkio encrypts data at rest and in flight.

Does Akkio support having a secure VPN tunnel between Akkio and a data source?

No; however, we are SOC 2 Type II compliant.

What data does Akkio share with OpenAI (GPT)?

Akkio sends the dataset's shape, column names, data types, and typical values for each column to GPT. Specifically, we use the commercial API and decline info sharing. Details on the OpenAI API policy can be found on their website.

Does Akkio support having a TLS tunnel between a data store and an AWS instance of Akkio?

Yes, we do, and we inherit Amazon or Google security.

Is Akkio GDPR Compliant?

This is coming very soon. If this is a requirement for your business, please reach out to support.

About Akkio modeling and predictions

What types of statistical techniques are these models using to make predictions?

We use several modeling methods, including Neural Networks, Random Forests, and Decision Trees. These are described as follows:
Neural networks model complex input-target relationships using linear and non-linear transformations, optimized by gradient descent.
Random forests use bagging and feature randomness to combine the outputs of multiple decision trees for higher accuracy and reduced overfitting.
Decision trees recursively split input data based on feature values, aiming for homogeneous target variable subsets, determined by techniques like entropy, Gini impurity, or information gain.
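
The split criteria named for decision trees can be made concrete with a short sketch. These are the textbook definitions of Gini impurity, entropy, and information gain; which criterion Akkio's models actually use is not stated, and the toy labels are illustrative.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0 means a pure subset."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# A perfectly mixed parent that one split separates into two pure subsets.
parent = ["yes"] * 5 + ["no"] * 5
left, right = ["yes"] * 5, ["no"] * 5

parent_gini = gini_impurity(parent)
parent_entropy = entropy(parent)
gain = information_gain(parent, left, right)
```

A tree-building algorithm evaluates candidate splits this way and picks the one that leaves the most homogeneous subsets.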

How do we know what assumptions the model is making, if it's generalizable, or even statistically significant?

Different algorithms make varying assumptions about the data distribution. Non-parametric models like decision trees and random forests make fewer assumptions, while neural networks assume differentiability in input-target relationships. Though statistical significance isn't directly evaluated, performance metrics like accuracy, precision, recall, and F1 score can be used to assess a model's effectiveness.
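
As a reminder of how these metrics relate, here is a minimal sketch that computes them for a binary classifier from true and predicted labels. The function name and toy data are illustrative and are not taken from Akkio's reports.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision and recall matter most on imbalanced data, where accuracy alone can look high even if the minority class is rarely caught.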

How do we handle multicollinearity and singularity and outliers?

Multicollinearity is addressed within the platform to help remove redundant features and improve model performance.
Singularity, often caused by a high degree of correlation between features or perfect collinearity, can be resolved by removing one of the collinear features.
We are generally robust to outliers, but if necessary, they can be removed with chat data prep or the soon-to-be-launched data cleaning tool. Some models, like decision trees and random forests, are less sensitive to outliers than others.
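
If you want to check for collinear features yourself before uploading, one simple approach is to flag column pairs with a very high Pearson correlation. This is a sketch under assumptions: the 0.95 threshold, the column names, and the helper names are hypothetical, and it does not describe how the platform handles multicollinearity internally.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def collinear_pairs(columns, threshold=0.95):
    """Return column-name pairs whose absolute correlation meets the threshold."""
    names = sorted(columns)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if abs(pearson(columns[a], columns[b])) >= threshold]

columns = {
    "price_usd": [10.0, 20.0, 30.0, 40.0],
    "price_cents": [1000.0, 2000.0, 3000.0, 4000.0],  # exact multiple of price_usd
    "visits": [3.0, 1.0, 4.0, 1.0],
}
pairs = collinear_pairs(columns)
```

Here `price_cents` is a rescaled copy of `price_usd`, so one of the two can be dropped without losing information.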

Are the modeling processes transparent?

We provide insight into the driving factors for all models as part of the model creation process. We call this the insights report, and it works to make the model decision-making more transparent.
The goal of our platform is to provide ML capabilities without the need for code, so the transparency comes through these reports in a digestible form. More detail can be found by drilling into the advanced sections of the report.

API

What is the API response time/volumes it can handle?

Five requests per second. However, these requests can be bulk calls, which makes API handling more case by case. Feel free to reach out to support about your specific use case.
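
A client can stay under a limit like this by spacing out its own calls. The sketch below is a generic client-side throttle and makes no Akkio API calls; the class name is hypothetical, and how the limit is enforced on the server side is not described here.

```python
import time

class RequestThrottle:
    """Space calls so that at most max_per_second of them start each second."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval

throttle = RequestThrottle(max_per_second=5)
```

Call `throttle.wait()` immediately before each request. Sending many rows in one bulk call, where that is available, reduces the number of requests that need throttling.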