FAQ

Frequently asked questions about Kaisar AI Ops.

General

What is Kaisar AI Ops?

Kaisar AI Ops is a unified platform for managing deep learning workflows and AI operations. It provides tools for experiment tracking, model management, dataset versioning, and production deployment.

Who can use Kaisar AI Ops?

Data scientists, ML engineers, researchers, and teams working on AI/ML projects.

Is there a free tier?

Contact your administrator or sales team for pricing information, including whether a free tier is available to you.

What frameworks are supported?

  • PyTorch

  • TensorFlow

  • Scikit-learn

  • XGBoost

  • Custom frameworks via Docker

Getting Started

How do I get access?

Contact your organization administrator to create an account and assign permissions.

Do I need to install anything?

No. Kaisar AI Ops is web-based. Install the CLI tool or SDK only if you want programmatic access.

How do I reset my password?

Click "Forgot Password" on the login page, or contact your administrator.

Experiments

How many experiments can I run simultaneously?

This depends on your organization's quota and available compute resources. Check your quota in the dashboard.

Can I run experiments locally?

Kaisar AI Ops is designed for cloud-based execution, but you can track local experiments using the SDK.

How long are experiment logs retained?

Logs are retained for 30 days by default. Administrators can configure a different retention period.
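To work out when a given log would leave the retention window, the arithmetic can be sketched as below. This is a minimal illustration that assumes retention is counted in whole days from the log's creation time; the function names are hypothetical and not part of any Kaisar AI Ops tool.

```python
from datetime import datetime, timedelta, timezone

# 30 days is the default stated above; administrators may configure another value.
DEFAULT_RETENTION_DAYS = 30


def log_expiry(created_at: datetime, retention_days: int = DEFAULT_RETENTION_DAYS) -> datetime:
    """Return the moment at which a log falls outside the retention window."""
    return created_at + timedelta(days=retention_days)


def is_expired(created_at: datetime, now: datetime,
               retention_days: int = DEFAULT_RETENTION_DAYS) -> bool:
    """True once `now` has reached or passed the expiry moment."""
    return now >= log_expiry(created_at, retention_days)


if __name__ == "__main__":
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(log_expiry(created).date())
```

If you need logs for longer than the configured period, export them before the expiry date.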

Can I compare experiments?

Yes. Select multiple experiments and click "Compare" to view their metrics and configurations side by side.

Models

How do I version my models?

Use semantic versioning (e.g., 1.0.0) when registering models. Each version is tracked separately.
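Version strings such as 1.10.0 and 1.9.3 do not sort correctly as plain text, so it helps to parse them before comparing. The sketch below handles only the plain MAJOR.MINOR.PATCH form (no pre-release or build tags); `parse_version` and `bump` are illustrative helpers, not part of the Kaisar AI Ops SDK.

```python
import re
from typing import NamedTuple

_SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    match = _SEMVER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Return the next version, resetting the lower-order fields to zero."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    registered = ["1.0.0", "1.9.3", "1.10.0"]
    latest = max(registered, key=parse_version)
    print(latest, bump(parse_version(latest), "minor"))
```

A common convention is to bump the major number for incompatible changes (for example, a new input schema), the minor number for retrained or improved models, and the patch number for fixes.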

What model formats are supported?

  • PyTorch (.pt, .pth)

  • TensorFlow (SavedModel, .h5)

  • ONNX (.onnx)

  • Scikit-learn (pickle)

  • Custom formats

Can I deploy models to production?

Yes. Use the Deployments feature to deploy models with auto-scaling and monitoring.

Datasets

What's the maximum dataset size?

This depends on your storage quota. Contact your administrator for limits.

Can I use external storage?

Yes, Kaisar AI Ops supports S3, GCS, and Azure Blob Storage.

How do I version datasets?

Create a new version when uploading updated data. Link experiments to specific dataset versions for reproducibility.
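One way to make the link between an experiment and a dataset version checkable is to record a content hash next to the version label. The sketch below uses only the Python standard library and is not the Kaisar AI Ops SDK; `dataset_fingerprint`, `experiment_record`, and the record fields are hypothetical names chosen for illustration.

```python
import hashlib


def dataset_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def experiment_record(experiment: str, dataset: str, version: str, data_path: str) -> dict:
    """Build a record that pins an experiment to one dataset version and its content hash.

    The field names are assumptions, not a documented Kaisar AI Ops schema.
    """
    return {
        "experiment": experiment,
        "dataset": {
            "name": dataset,
            "version": version,
            "sha256": dataset_fingerprint(data_path),
        },
    }
```

If the hash of the data you train on later differs from the recorded one, the data has changed and should be uploaded as a new version.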

Deployments

How do I deploy a model?

  1. Register your model

  2. Navigate to Deployments

  3. Click "Create Deployment"

  4. Select your model and configure resources

  5. Click "Deploy"

Can deployments auto-scale?

Yes. Configure auto-scaling based on CPU usage, request rate, or custom metrics.
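To build intuition for how a metric target translates into a replica count, the sketch below applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler. Whether Kaisar AI Ops uses this exact rule is an assumption; the function name, parameters, and default bounds are illustrative.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale the replica count in proportion to metric / target, clamped to the bounds.

    `current_value` and `target_value` must use the same unit, for example
    average CPU utilization in percent or requests per second per replica.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target
    print(desired_replicas(4, 90, 60))
```

The minimum and maximum bounds matter in practice: the minimum keeps the deployment available when traffic drops to zero, and the maximum caps compute cost during spikes.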

How do I monitor deployments?

View real-time metrics in the Deployments section, including request rate, latency, and error rate.
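If you want to cross-check dashboard figures against your own request logs, the three metrics can be computed as sketched below. The definitions are assumptions (error rate counts 5xx responses, latency is reported as a nearest-rank 95th percentile) and the dashboard may define them differently; `RequestSample` and `summarize` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RequestSample:
    latency_ms: float
    status: int  # HTTP status code of the response


def summarize(samples: list, window_seconds: float) -> dict:
    """Compute request rate, p95 latency, and error rate for one time window."""
    if not samples or window_seconds <= 0:
        raise ValueError("need at least one sample and a positive window")
    latencies = sorted(sample.latency_ms for sample in samples)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = (95 * len(latencies) + 99) // 100
    errors = sum(1 for sample in samples if sample.status >= 500)
    return {
        "request_rate": len(samples) / window_seconds,  # requests per second
        "p95_latency_ms": latencies[rank - 1],
        "error_rate": errors / len(samples),  # fraction of requests that failed
    }
```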

Security & Access

How is authentication handled?

Kaisar AI Ops uses a centralized Identity Provider for authentication, supporting SSO, SAML, and OAuth.

Can I use SSO?

Yes, if configured by your administrator.

How do I create API tokens?

Navigate to Profile → API Tokens → Create Token.

Are API tokens secure?

Yes, provided you handle them carefully. You should:

  • Never commit tokens to version control

  • Rotate tokens regularly

  • Use minimum required permissions

  • Store tokens securely
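A simple way to follow the first and last points is to read the token from the environment at runtime instead of writing it into source code. In the sketch below, the variable name `KAISAR_API_TOKEN` and the Bearer header scheme are assumptions; check the API Reference for the names and scheme your installation expects.

```python
import os

# Hypothetical variable name; use whatever your team has standardized on.
TOKEN_ENV_VAR = "KAISAR_API_TOKEN"


def auth_headers(environ=os.environ) -> dict:
    """Build request headers from a token held in the environment, never in code."""
    token = environ.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"set {TOKEN_ENV_VAR} before calling the API")
    # The Bearer scheme is an assumption, not confirmed by this FAQ.
    return {"Authorization": f"Bearer {token}"}


def mask(token: str, visible: int = 4) -> str:
    """Return a form of the token that is safe to print in logs."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

In CI systems, set the variable through the platform's secret store so the value never appears in the repository or in build logs.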

Billing & Usage

How am I charged?

Charges typically include:

  • Compute hours (GPU/CPU)

  • Storage (datasets, models, logs)

  • API requests

  • Data transfer

Contact your administrator for specific pricing.

How do I monitor usage?

Navigate to Admin → Billing & Usage to view current usage and costs.

Can I set budget alerts?

Yes, administrators can configure budget alerts and quotas.

API & Integration

Is there an API?

Yes, a comprehensive REST API is available. See the API Reference.

Are there SDKs?

Yes, Python and JavaScript/TypeScript SDKs are available.

Can I integrate with CI/CD?

Yes, Kaisar AI Ops supports webhooks and integrations with GitHub Actions, GitLab CI, and Jenkins.

Can I export data?

Yes, you can export experiments, models, and datasets via the UI or API.

Troubleshooting

My experiment is stuck in "pending"

Check:

  • Resource quotas

  • Cluster capacity

  • Experiment configuration

  • Logs for errors

I can't access a resource

Verify:

  • Your permissions

  • Resource sharing settings

  • Organization membership

API requests are failing

Check:

  • Token validity

  • Token permissions

  • Rate limits

  • Request format
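The HTTP status code of a failed response usually indicates which item on this checklist to look at first. The mapping below follows standard HTTP semantics; that the Kaisar AI Ops API returns exactly these codes for each case is an assumption, and `diagnose` is an illustrative helper.

```python
from http import HTTPStatus

# Standard HTTP meanings mapped onto the checklist above.
_HINTS = {
    HTTPStatus.BAD_REQUEST: "request format: check the body, parameters, and content type",
    HTTPStatus.UNAUTHORIZED: "token validity: the token may be missing, expired, or revoked",
    HTTPStatus.FORBIDDEN: "token permissions: the token lacks access to this resource",
    HTTPStatus.TOO_MANY_REQUESTS: "rate limits: slow down and retry after a delay",
}


def diagnose(status_code: int) -> str:
    """Return the checklist item most likely responsible for a failed request."""
    hint = _HINTS.get(status_code)
    if hint is not None:
        return hint
    if 500 <= status_code < 600:
        return "server error: retry later and contact support if it persists"
    return "unrecognized status: inspect the response body for details"


if __name__ == "__main__":
    for code in (400, 401, 403, 429, 503):
        print(code, diagnose(code))
```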

Dashboard is slow

Try:

  • Clearing browser cache

  • Reducing displayed items

  • Using a different browser

  • Checking internet connection

Support

How do I get help?

How do I report a bug?

Submit a ticket via the Support Portal with:

  • Steps to reproduce

  • Error messages

  • Screenshots

  • System information

Is there a community?

Yes, join our:

  • GitHub Discussions

  • Stack Overflow (tag: kaisar-ai-ops)

  • Community Forum
