
Your application's backend code and Docker containers run on sherpa.sh's application servers. These servers sit behind our CDN and load balancer within our Kubernetes cluster in your selected region, providing a robust execution environment for your applications.

When you deploy, the application servers take on the compute-intensive work while the CDN and load balancer handle traffic distribution and static content delivery.

Architecture Flow:

  1. User request → CDN (static files) or Load Balancer (dynamic content)
  2. Load Balancer → Application Server instances
  3. Application Servers → Your backend code/containers
  4. Response flows back through the same path

For detailed architecture information, see our Architecture Overview page.

Every application server is deployed as a swarm of Docker containers behind a load balancer, and each individual container has a maximum resource allocation.

# Per-instance limits
CPU: 1 core maximum
Memory: 1GB maximum
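
Because each container is capped at one core and 1GB of memory, it can help to watch your process's footprint from inside the app. Below is a minimal sketch using Node's built-in process.memoryUsage(); the interval and warning threshold are illustrative, not platform requirements.

// lib/memory-watch.js - hypothetical helper: warn as the process nears the 1GB cap
const LIMIT_BYTES = 1024 * 1024 * 1024; // per-instance memory maximum

export function startMemoryWatch(intervalMs = 60000) {
  return setInterval(() => {
    const { rss } = process.memoryUsage(); // resident set size of this process
    const fraction = rss / LIMIT_BYTES;
    if (fraction > 0.8) {
      console.warn(`Memory at ${(fraction * 100).toFixed(1)}% of the 1GB limit`);
    }
  }, intervalMs);
}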

The number of container replicas created depends on your selected plan.

# Hobby and Starter Plan Settings
Minimum instances: 1
Maximum instances: 5
CPU scaling threshold: 80%

How scaling works: When your backend code uses more than 80% CPU across instances, new application servers automatically spin up to handle the load.
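
Scale-down is the flip side of this: instances are also removed when load drops, and Kubernetes sends SIGTERM before stopping a container. If you run a custom Node server, a small shutdown hook like the sketch below lets in-flight requests finish; the server here is whatever http.Server your app creates.

// server.js - hypothetical graceful-shutdown hook for a custom Node server
import http from 'http';

const server = http.createServer((req, res) => {
  res.end('ok');
});
server.listen(process.env.PORT || 3000);

// Kubernetes sends SIGTERM before removing an instance during scale-down:
// stop accepting new connections, let in-flight requests finish, then exit.
process.on('SIGTERM', () => {
  server.close(() => process.exit(0));
});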

If you need more instance replicas or different CPU thresholds, reach out to support at discord.sherpa.sh.

Shared Infrastructure

What you get: Your containers/code run on shared infrastructure alongside other applications.

Benefits:

  • Zero configuration: Deploy immediately
  • Automatic load distribution: Traffic spreads across instances
  • Cost-effective: Shared infrastructure costs
  • Built-in monitoring: Performance metrics included

Limitations:

  • Ephemeral storage: No persistent file writes (see the storage sketch below)
  • Shared resources: CPU/memory shared with other applications
  • Standard resource limits: Fixed allocation per instance

Best for: Stateless APIs, web applications, microservices
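
Because instance storage is ephemeral here, anything your app would write to the local filesystem should go to external storage instead. Below is a sketch using the AWS SDK against an S3-compatible bucket; the environment variable names and helper are assumptions, not part of the platform.

// lib/storage.js - hypothetical upload helper; local writes are lost when instances recycle
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({
  region: process.env.S3_REGION,
  endpoint: process.env.S3_ENDPOINT, // any S3-compatible provider
});

export async function saveUpload(key, body) {
  await s3.send(new PutObjectCommand({
    Bucket: process.env.S3_BUCKET,
    Key: key,
    Body: body,
  }));
}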

Dedicated Servers

What you get: Exclusive physical servers running only your application containers.

Available Configurations:

  • Compute: 2-96 CPU cores, 4-256GB memory
  • Storage: 80GB-300TB persistent disk
  • Network: 1-10Gbps dedicated bandwidth
  • Transfer: 20TB monthly included

Benefits:

  • Guaranteed performance: No resource contention
  • Persistent storage: File system writes supported
  • Custom sizing: Tailored to your workload
  • Isolation: Enhanced security and performance

Best for: Databases, file processing, high-traffic applications

Your application servers run in the region you select during deployment. View our available regions.

Benefits of regional deployment:

  • Reduced latency: Servers closer to your users
  • Data compliance: Meet regional data requirements
  • Improved performance: Faster database connections

To monitor your application servers:

  1. Navigate to your application dashboard
  2. Go to Resources > Application Server
  3. Monitor instance count, CPU usage, and memory consumption

To request a dedicated server:

  1. Visit discord.sherpa.sh
  2. Create a support ticket with your requirements:
    • Expected traffic volume
    • Resource requirements (CPU/memory)
    • Storage needs
    • Performance requirements

Design your application to work seamlessly across multiple server instances by avoiding in-memory state storage.

Good: External State Management

// pages/api/users/[id].js
import { getUser } from '../../../lib/database';

export default async function handler(req, res) {
  const { id } = req.query;

  // Fetch from external database, not server memory
  const user = await getUser(id);
  if (!user) {
    return res.status(404).json({ error: 'User not found' });
  }

  res.status(200).json(user);
}

Avoid: In-Memory State

// Don't do this - data lost during scaling
import { getUser } from '../../../lib/database';

let userCache = {}; // Lost when new instances start

export default async function handler(req, res) {
  const { id } = req.query;
  if (!userCache[id]) {
    userCache[id] = await getUser(id); // Won't persist across instances
  }
  res.status(200).json(userCache[id]);
}

Sherpa.sh automatically checks your application health by requesting the root URL (/). Ensure this endpoint returns a valid response.

Required Health Check Setup

// pages/index.js or app/page.js (App Router)
export default function Home() {
  return (
    <div>
      <h1>Application Status: Healthy</h1>
      <p>Server is running normally</p>
    </div>
  );
}

// Or, for API-only applications, serve the health payload from an API route
// pages/api/index.js
export default function handler(req, res) {
  res.status(200).json({
    status: 'healthy',
    timestamp: new Date().toISOString(),
    version: process.env.npm_package_version || '1.0.0',
  });
}
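
Note that an API route lives under /api, while the health check requests /. One way to bridge that, assuming you want the API route to answer the check, is a rewrite in next.config.js:

// next.config.js - hypothetical rewrite so the health check at / reaches pages/api/index.js
module.exports = {
  async rewrites() {
    return [{ source: '/', destination: '/api' }];
  },
};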

Advanced Health Check with Dependencies

// pages/api/index.js
import { checkDatabase } from '../../lib/database';
import { checkExternalAPI } from '../../lib/external-services';

export default async function handler(req, res) {
  try {
    // Check critical dependencies
    await checkDatabase();
    await checkExternalAPI();

    res.status(200).json({
      status: 'healthy',
      checks: {
        database: 'connected',
        external_api: 'responding',
      },
      timestamp: new Date().toISOString(),
    });
  } catch (error) {
    res.status(503).json({
      status: 'unhealthy',
      error: error.message,
      timestamp: new Date().toISOString(),
    });
  }
}

Optimize your application for auto-scaling by implementing efficient async patterns and resource management.

Database Connection Management

// lib/database.js
import { Pool } from 'pg';

// Use connection pooling for database efficiency
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // Maximum connections per instance
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

export async function queryDatabase(query, params) {
  const client = await pool.connect();
  try {
    const result = await client.query(query, params);
    return result.rows;
  } finally {
    client.release(); // Always release the connection back to the pool
  }
}
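
A route handler can then go through the shared pool, for example (the orders query is illustrative):

// pages/api/orders.js - example caller; every request in this instance reuses the pool above
import { queryDatabase } from '../../lib/database';

export default async function handler(req, res) {
  const rows = await queryDatabase(
    'SELECT * FROM orders WHERE user_id = $1',
    [req.query.userId]
  );
  res.status(200).json(rows);
}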

Async API Route Optimization

// pages/api/products/search.js
// searchProducts, getCategories, and getRecommendations are assumed to be
// defined in your own data layer
import { searchProducts, getCategories, getRecommendations } from '../../../lib/catalog';

export default async function handler(req, res) {
  const { query, category } = req.query;

  try {
    // Run independent requests in parallel for better performance
    const [products, categories, recommendations] = await Promise.all([
      searchProducts(query),
      getCategories(category),
      getRecommendations(query),
    ]);

    res.status(200).json({
      products,
      categories,
      recommendations,
    });
  } catch (error) {
    console.error('Search error:', error);
    res.status(500).json({ error: 'Search failed' });
  }
}
Key metrics to monitor:

  • CPU utilization: Monitor for scaling triggers
  • Memory usage: Track memory leaks and optimization opportunities
  • Response time: Measure application performance
  • Instance count: Understand scaling patterns

Optimization tips:

  • Database optimization: Use connection pooling and query optimization
  • Caching: Implement Redis or similar caching
  • Async processing: Use background tasks for heavy operations (see the sketch after this list)
  • Resource monitoring: Set up alerts for resource usage
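
For the async-processing tip, the idea is to acknowledge the request quickly and move heavy work off the request path. Below is a minimal in-process sketch; buildReport is an assumed helper, and across multiple instances a shared queue (for example BullMQ backed by Redis) is the more robust choice.

// pages/api/reports/generate.js - hypothetical fire-and-acknowledge pattern
import { randomUUID } from 'crypto';
import { buildReport } from '../../../lib/reports'; // assumed heavy-work helper

export default async function handler(req, res) {
  const jobId = randomUUID();

  // Respond immediately; the client can poll for the result elsewhere.
  res.status(202).json({ jobId, status: 'queued' });

  // Defer the heavy work so it runs after the response is flushed.
  setImmediate(async () => {
    try {
      await buildReport(jobId, req.body);
    } catch (err) {
      console.error(`Report job ${jobId} failed:`, err);
    }
  });
}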
Next steps:

  • Review architecture: Read our Architecture Overview for complete system understanding
  • Set up monitoring: Configure application performance monitoring
  • Implement caching: Add Redis or a similar caching layer
  • Plan scaling: Consider dedicated servers for growing applications
  • Optimize performance: Profile your application for bottlenecks

Need help optimizing your application server setup? Our support team provides detailed performance analysis and recommendations for your specific use case.