ECM offers a fully managed OpenAI service for its customers. This offering is built on two Azure services:
- Azure OpenAI
- Azure API Management
Azure API Management secures, load balances, and ensures that connected applications can access the Azure OpenAI service in a predictable fashion.
API Management provides several important features for ensuring availability of the OpenAI service (a minimal client sketch follows the list below):
- API Access control
- API Versioning Control
- Load balancing
  - Round robin
  - Priority-based routing
- Circuit breaking functionality
- Content Filtering
- Semantic Caching
- Token Metrics and Rate Limiting
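As an illustration of how a connected application might call the OpenAI service through the API Management gateway, here is a minimal Python sketch. The gateway URL, API path, deployment name, key values, and header names are placeholders and depend on how the APIM API and its policies are configured in your environment.

```python
# A minimal sketch, assuming an APIM gateway fronts the Azure OpenAI service.
# All endpoint, key, and deployment values below are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-apim-gateway.azure-api.net/openai",  # APIM gateway URL (placeholder)
    api_key="<azure-openai-or-dummy-key>",  # some APIM policies inject the real backend key
    api_version="2024-02-01",
    default_headers={"Ocp-Apim-Subscription-Key": "<apim-subscription-key>"},  # APIM access control
)

response = client.chat.completions.create(
    model="gpt-4o",  # the Azure OpenAI *deployment* name (placeholder)
    messages=[{"role": "user", "content": "Hello from behind API Management"}],
)
print(response.choices[0].message.content)
```

Because routing, load balancing, circuit breaking, and token rate limiting are enforced by APIM policies on the gateway side, the client code stays the same regardless of which backend OpenAI instance ultimately serves the request.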
OpenAI offers a range of services and products that leverage artificial intelligence and natural language processing technologies. Azure OpenAI Service provides REST API access to OpenAI's powerful language models.
Service Highlights
| Feature | Azure OpenAI Gov | Azure OpenAI Global |
| --- | --- | --- |
| Authorization | FedRAMP High | Various (e.g., GDPR, HIPAA) |
| Data Centers | US-based | Global (multiple regions) |
| Compliance | FISMA, DFARS | GDPR, HIPAA, etc. |
| Support | Dedicated government support | Priority business support |
| Availability | US-only | Global |
| Language Support | English | Multi-language support |
| Scalability | Designed for government agencies | Designed for all business sizes |
Use Cases
- Large Language Models (LLM)
- Conversational AI (ChatGPT)
  - OpenAI is best known for its GPT family of models (GPT-1 through GPT-3.5 and later releases). These models are designed for natural language understanding and generation tasks such as text completion, translation, summarization, and more.
- Content Creation (DALL-E)
  - A model focused on generating images from textual descriptions. DALL-E creates novel images based on user-provided prompts, expanding the applications of AI into creative visual domains (see the image-generation sketch after this list).
- CLIP (Contrastive Language-Image Pre-training)
  - A model that learns visual concepts from a vast dataset of images and associated text, enabling it to match images with natural-language descriptions (for example, for zero-shot image classification).
- API Access (for integration into new or existing applications)
  - ChatGPT API
    - Provides access to the GPT models for developers to integrate into their applications, allowing for conversational AI capabilities.
  - Codex API
    - Tailored for coding tasks, using a model specifically trained on programming languages and code completion.
- OpenAI Gym
  - A toolkit for developing and comparing reinforcement learning algorithms. It provides a wide variety of environments (simulators) where AI agents can be trained and tested.
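As a concrete example of the DALL-E use case above, the following is a minimal sketch using the openai Python SDK against an Azure OpenAI resource. The endpoint, key, and deployment name ("dall-e-3") are illustrative placeholders.

```python
# A minimal sketch, assuming a DALL-E 3 deployment named "dall-e-3" exists on the
# Azure OpenAI resource; endpoint and key values are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-openai-account.openai.azure.com",  # placeholder
    api_key="<azure-openai-key>",                                 # placeholder
    api_version="2024-02-01",
)

result = client.images.generate(
    model="dall-e-3",  # the image model deployment name
    prompt="A watercolor painting of a lighthouse at dawn",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)  # URL of the generated image
```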
Things to know
- Microsoft is moving toward primary model deployment and model management being done from Azure AI Studio. This resource is created alongside the deployment of the Azure OpenAI service.
- The Azure OpenAI Service is continually being updated; check What's new in Azure OpenAI Service? - Azure AI services | Microsoft Learn to review new features and functionality.
- Features released to Azure US Government are normally delayed by several months.
Model options
Deployment Options
- Standard - The basic pay-as-you-go option, which can be deployed from the Azure Portal or through automation.
- Provisioned - Pre-purchased throughput units (capacity) that you pay for ahead of time.
- Global - Leverages Azure's global infrastructure, dynamically routing customer traffic to the data center with the best availability for the customer's inference requests (see the deployment sketch after this list). Three offerings:
  - Batch - Offline scoring for workloads that are not latency-sensitive and can be completed in hours; for use cases that do not have data processing residency requirements.
  - Standard - Same as the standard deployment, but using the global infrastructure.
  - Provisioned - Same as Provisioned, but using the global infrastructure.
- Global deployments are the first to receive all new models and features.
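The deployment type is selected through the SKU when the model deployment is created. The following is a minimal sketch using the azure-mgmt-cognitiveservices SDK; the subscription, resource group, account, model name/version, and capacity values are placeholders, and the SKU names actually available depend on the model and region.

```python
# A minimal sketch mapping deployment type to deployment SKU; all names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment,
    DeploymentModel,
    DeploymentProperties,
    Sku,
)

client = CognitiveServicesManagementClient(
    DefaultAzureCredential(), subscription_id="<subscription-guid>"  # placeholder
)

deployment = Deployment(
    # "Standard" = pay-as-you-go; "ProvisionedManaged" = PTUs;
    # "GlobalStandard" / "GlobalBatch" / "GlobalProvisionedManaged" = global offerings.
    sku=Sku(name="GlobalStandard", capacity=10),
    properties=DeploymentProperties(
        model=DeploymentModel(format="OpenAI", name="gpt-4o", version="2024-05-13")
    ),
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="rg-openai",    # placeholder
    account_name="my-openai-account",   # placeholder
    deployment_name="gpt-4o-global",    # placeholder
    deployment=deployment,
)
print(poller.result().id)
```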
Compliance and Standards
- Azure OpenAI adheres to industry standards and regulations concerning data privacy and security, ensuring compliance with frameworks like GDPR (General Data Protection Regulation) and following best practices in cybersecurity.
- The service provides end-to-end encryption and complies with various regulations and standards, such as GDPR, HIPAA, and ISO 27001.
- No customer data (prompts, outputs, embeddings, or training data) is shared with other customers or with OpenAI, and by default it is not used to improve Microsoft's AI offerings.
Security and Monitoring
- Azure OpenAI diagnostic logs can be sent to both a Log Analytics workspace and Event Hubs (see the sketch below).
- Customer-managed keys are supported in both the Global and Government clouds.
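A minimal sketch of routing Azure OpenAI diagnostic logs to both targets with the azure-mgmt-monitor SDK is shown below; the resource IDs, workspace, Event Hubs names, and log categories are placeholders and may differ in your environment.

```python
# A minimal sketch, assuming the azure-mgmt-monitor SDK; all IDs and names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import DiagnosticSettingsResource, LogSettings

subscription_id = "<subscription-guid>"  # placeholder
openai_resource_id = (
    "/subscriptions/<subscription-guid>/resourceGroups/rg-openai"
    "/providers/Microsoft.CognitiveServices/accounts/my-openai-account"
)  # placeholder

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

client.diagnostic_settings.create_or_update(
    resource_uri=openai_resource_id,
    name="openai-diagnostics",
    parameters=DiagnosticSettingsResource(
        workspace_id="<log-analytics-workspace-resource-id>",        # Log Analytics target
        event_hub_authorization_rule_id="<event-hub-auth-rule-id>",  # Event Hubs target
        event_hub_name="openai-logs",
        logs=[
            LogSettings(category="Audit", enabled=True),            # category names may vary
            LogSettings(category="RequestResponse", enabled=True),
        ],
    ),
)
```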
Required RBAC Permissions
The following actions are the minimum required to deploy the OpenAI service and a model deployment (a custom-role sketch follows the list):
Microsoft.CognitiveServices/*/read
Microsoft.CognitiveServices/accounts/write
Microsoft.CognitiveServices/accounts/deployments/*
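One way to package these actions is a custom RBAC role. The following is a minimal sketch using the azure-mgmt-authorization SDK; the role name and scope are hypothetical, and in practice the role may instead be created through Terraform or the portal.

```python
# A minimal sketch defining a custom role with the minimum actions listed above.
# Role name, scope, and subscription are placeholders.
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import Permission, RoleDefinition

subscription_id = "<subscription-guid>"      # placeholder
scope = f"/subscriptions/{subscription_id}"  # role is assignable at this scope

client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)

role = RoleDefinition(
    role_name="OpenAI Model Deployer (custom)",  # hypothetical role name
    description="Minimum actions to deploy the Azure OpenAI service and a model deployment.",
    role_type="CustomRole",
    permissions=[
        Permission(
            actions=[
                "Microsoft.CognitiveServices/*/read",
                "Microsoft.CognitiveServices/accounts/write",
                "Microsoft.CognitiveServices/accounts/deployments/*",
            ]
        )
    ],
    assignable_scopes=[scope],
)

client.role_definitions.create_or_update(
    scope=scope,
    role_definition_id=str(uuid.uuid4()),  # new custom roles need a GUID id
    role_definition=role,
)
```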
Required Info
- VNet/subnet that the private endpoint will be deployed to
- Models that should be deployed
Pricing Model
- There are two pricing structures for Azure OpenAI services: pay-as-you-go and provisioned throughput units (PTUs).
- Provisioned throughput units are pre-purchased capacity that you pay for ahead of time.
- Pay-as-you-go accrues cost as the service is used (a quick cost-estimation sketch follows the pricing details below).
- Additional Costs related to using OpenAI may include:
- Diagnostic Logging
- Private Endpoint traffic
- Monitoring Alerts
OpenAI offers a variety of pricing plans for its ChatGPT and API services to cater to different needs:
ChatGPT Plans:
- Free: Basic access to GPT-3.5, limited access to GPT-4, data analysis, file uploads, vision, web browsing, and custom GPTs.
- Plus ($20/month): Access to GPT-4, early access to new features, and enhanced capabilities like DALL·E image generation and custom GPTs.
- Team ($25/user/month billed annually or $30/user/month billed monthly): Includes everything in Plus, higher message limits, workspace management, and enhanced privacy.
- Enterprise: Customized for larger organizations with unlimited access to GPT-4, enhanced admin controls, and premium support. Pricing is available upon request.
API Pricing:
- GPT-3.5 Turbo: $0.0015 per 1K tokens for outputs and $0.0005 per 1K tokens for inputs.
- GPT-4 Turbo: Priced significantly lower than GPT-4, with GPT-4 Turbo with vision capabilities planned for launch.
- Embedding Models: New models like text-embedding-3-small cost $0.00002 per 1K tokens, and text-embedding-3-large costs $0.00013 per 1K tokens, offering improved performance at reduced prices.
Additionally, OpenAI provides special pricing for educational institutions and nonprofits, offering discounts for broader deployment and accessibility.
For the latest details and updates, you can visit the OpenAI pricing page and the API pricing page on their official website.
Consider using Provisioned Throughput Units (PTUs) for optimal scaling and minimal latency variance.
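For a rough sense of pay-as-you-go costs, the arithmetic below uses the GPT-3.5 Turbo per-1K-token prices quoted above; prices change frequently, so always confirm against the current pricing pages.

```python
# A quick back-of-the-envelope estimate using the per-1K-token prices quoted above.
INPUT_PRICE_PER_1K = 0.0005   # GPT-3.5 Turbo input, USD per 1K tokens
OUTPUT_PRICE_PER_1K = 0.0015  # GPT-3.5 Turbo output, USD per 1K tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate pay-as-you-go cost in USD for a single request."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K + (
        output_tokens / 1000
    ) * OUTPUT_PRICE_PER_1K

# Example: 2,000 prompt tokens and 500 completion tokens per request,
# at 10,000 requests per day.
per_request = estimate_cost(2_000, 500)
print(f"per request: ${per_request:.5f}, per day: ${per_request * 10_000:.2f}")
```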
Service Quotas
Automated Deployment
- Terraform code to simplify the deployment of OpenAI to adhere to ECM standards can be found here: https://gitlab.leidos.com/hs/cloudeng/ecm-azure-terraform-modules/-/tree/main/azure_openai
- Review the README for examples of how to use the module, its variables, and its outputs.
Authentication and Access Control:
- Access to OpenAI's systems and APIs is tightly controlled through authentication mechanisms such as API keys, tokens, and OAuth, ensuring that only authorized users and applications can interact with the services (a keyless-authentication sketch follows below).
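In Azure OpenAI this translates to either API keys or Microsoft Entra ID tokens. The following is a minimal sketch of keyless (Entra ID) authentication with the openai Python SDK, assuming the caller has an appropriate RBAC role (for example, Cognitive Services OpenAI User) on the resource; the endpoint and deployment name are placeholders.

```python
# A minimal sketch of keyless (Microsoft Entra ID) authentication as an
# alternative to API keys; endpoint and deployment name are placeholders.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://my-openai-account.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,  # tokens refresh automatically; no key to manage
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o",  # deployment name (placeholder)
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```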
Monitoring and Logging:
- OpenAI continuously monitors its systems and services for suspicious activities and potential security breaches. Comprehensive logging and auditing help detect and respond to security incidents promptly.
Compliance and Standards:
- OpenAI adheres to industry standards and regulations concerning data privacy and security. They ensure compliance with frameworks like GDPR (General Data Protection Regulation) and follow best practices in cybersecurity.
Secure Development Practices:
- OpenAI employs secure coding practices and conducts regular security assessments, including vulnerability scanning and penetration testing, to identify and mitigate potential security risks in their software and infrastructure.
Ethical AI Principles:
- Beyond technical security measures, OpenAI emphasizes ethical considerations in AI development. They strive to ensure their AI systems are designed and deployed responsibly, addressing concerns related to bias, fairness, and societal impact.
Collaboration and Transparency:
- OpenAI fosters transparency by engaging with the research community and stakeholders. They collaborate on security research, share findings, and participate in discussions to enhance overall security practices in AI.
Response and Incident Management:
- In the event of a security incident, OpenAI has established incident response protocols to minimize impact and swiftly mitigate vulnerabilities. They communicate transparently with affected parties and take corrective actions to prevent future incidents.
User Education and Awareness:
- OpenAI promotes user education regarding security best practices when using their services. This includes providing guidelines and resources on secure integration and operation of AI technologies.
Availability:
- OpenAI services are available in many regions supported by Azure OpenAI Service, and the availability of specific models and services can vary by region, including the regions supported under ECM. Azure OpenAI Service offers various models, such as GPT-3.5, Codex, DALL-E 2, and Whisper, which are accessible in multiple regions worldwide, including major areas in North America, Europe, and Asia (MS Learn) (Azure) (GitHub).
- For detailed and up-to-date information on the specific regions where these services are available, you can refer to the Azure OpenAI Service region availability. This page lists all the regions where different OpenAI models can be deployed and used. Additionally, OpenAI's own documentation provides a list of supported countries and regions for its API services, which can be found here (OpenAI Platform).
Compliance Certifications:
Azure OpenAI holds multiple compliance certifications to cater to your specific compliance needs:
- ISO 27001: This internationally recognized certification demonstrates Azure OpenAI's adherence to rigorous information security management standards.
- FedRAMP High: With this authorization, government users and developers can integrate Azure OpenAI’s foundation models, such as GPT-4, GPT-3.5, and DALL-E, into their own cloud apps. The service provides high-performance AI models at production scale with industry-leading uptime.
Logging:
OpenAI, like many other technology companies, typically directs diagnostic logs to several logging targets for monitoring and troubleshooting purposes. While specific details can vary based on their current infrastructure and practices, here are some common logging targets where diagnostic logs may be sent:
Centralized Logging Systems:
- OpenAI likely utilizes centralized logging systems such as Elasticsearch, Splunk, or Graylog. These platforms aggregate logs from various services and components across their infrastructure, providing a unified view for monitoring and analysis.
Cloud Platform Services:
- If OpenAI operates on cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, they might leverage native logging services provided by these platforms (e.g., CloudWatch, Stackdriver, Azure Monitor). These services offer scalable storage and analysis capabilities for logs generated by cloud-based applications and services.
Application-specific Logging:
- OpenAI's applications and services may have their own logging mechanisms built into the software stack. These logs capture specific events and activities within individual components, aiding in debugging and performance monitoring.
Security Information and Event Management (SIEM) Systems:
- For security-related logs and events, OpenAI might integrate with SIEM systems such as Splunk Enterprise Security, IBM QRadar, or ArcSight. SIEM systems help detect and respond to security incidents by correlating logs from various sources.
Container Orchestration Platforms:
- If OpenAI employs containerized applications managed by platforms like Kubernetes or Docker Swarm, they likely send diagnostic logs to container logging solutions (e.g., Fluentd, Prometheus) for visibility into containerized environments.
Monitoring and Alerting Systems:
- OpenAI uses monitoring tools like Prometheus, Nagios, or Datadog to collect metrics and monitor the health and performance of their services. These tools often integrate with logging solutions to provide comprehensive observability.
Database and Application Performance Logs:
- Logs from databases (e.g., MySQL, PostgreSQL) and application performance monitoring (APM) tools (e.g., New Relic, AppDynamics) are crucial for diagnosing database queries, transactions, and application-level performance issues.
Customized Logging Pipelines:
- OpenAI may have customized logging pipelines tailored to their specific needs and requirements. These pipelines might include data preprocessing, transformation, and enrichment steps before logs are stored or analyzed.
Extensive Security Logging:
- Tracks user activity.
- Logs API calls.
- Monitors data movement.
Enhanced Security Analysis:
- Supports incident analysis.
- Complies with major security standards.
Real-time Threat Detection:
- Provides real-time security alerts.
Advanced Analytics Integration:
- Integrates with Azure Monitor and Power BI for in-depth analysis.
Customizable Security:
- Allows for custom log retention policies.
Bring Your Own Data (BYOD):
- Offers BYOD for enhanced data control.
- Enables tailored security configurations.
These logging targets enable OpenAI to maintain operational visibility, troubleshoot issues promptly, monitor performance metrics, and ensure the reliability and security of their AI services and applications. The specific choice of logging targets may evolve over time based on technological advancements and organizational priorities.
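For the Azure-managed offering specifically, diagnostic logs routed to a Log Analytics workspace (see Security and Monitoring above) can be queried programmatically. The following is a minimal sketch using the azure-monitor-query SDK; the workspace ID is a placeholder, and the table and column names assume the default AzureDiagnostics schema, which may differ if dedicated tables are used.

```python
# A minimal sketch querying Azure OpenAI diagnostic logs from Log Analytics.
# Workspace ID is a placeholder; table/column names assume the AzureDiagnostics schema.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize RequestCount = count() by OperationName, bin(TimeGenerated, 1h)"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-guid>",  # placeholder
    query=query,
    timespan=timedelta(days=1),
)

# Print one row per operation/hour bucket (ignores partial-result handling for brevity).
for table in response.tables:
    for row in table.rows:
        print(row)
```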
Azure OpenAI: Government vs. Global Offerings
This section compares and contrasts the key features of Azure OpenAI Government (Gov) and Azure OpenAI Global offerings.
Target Audience
- Azure OpenAI Gov: Designed specifically for government agencies and organizations in the United States.
- Azure OpenAI Global: Designed for businesses and organizations around the world.
Security and Compliance
Azure OpenAI Gov:
- FedRAMP High Authorization: Meets the highest security and compliance standards for government agencies.
- US-based Data Centers: Ensures sensitive government data remains within the country.
- Compliance: Designed to meet specific government regulations (FISMA, DFARS).
- Dedicated Support: Provides dedicated support for government agencies with priority access to Microsoft's AI experts.
Azure OpenAI Global:
- Global Availability: Data centers located in multiple regions worldwide.
- Compliance: Designed to meet global regulations (GDPR, HIPAA).
- Multi-language Support: Supports various languages.
Feature Comparison Table (see the Service Highlights table above)
Azure Global:
- Azure OpenAI Service is available globally, with a wide range of regions.
- This ensures low latency access for users worldwide. Specific regional availability details can be found on the Microsoft Azure website.
Azure OpenAI Service Models Legacy
- This article lists the retired Azure OpenAI Service models and their retirement dates.
- Some of the retired models include GPT-3.5, GPT-3, and embedding models.
- The article also gives recommendations for migrating to replacement models.
Azure OpenAI Service Models
- This article discusses different AI models that can be used for various tasks.
- Some of the models can generate different creative text formats, like poems, code, scripts and musical pieces.
- Other models can translate languages or convert speech to text.
- The article also details which regions different models are available in.
Additional Resources
By following the guidance above, you can deploy and utilize OpenAI services through Azure effectively, leveraging the power of advanced AI models for your applications.
Quotas and limits reference
The following table provides a quick guide to the default quotas and limits that apply to Azure OpenAI (a token-counting sketch follows the table):
| Limit Name | Limit Value |
| --- | --- |
| OpenAI resources per region per Azure subscription | 30 |
| Default DALL-E 2 quota limits | 2 concurrent requests |
| Default DALL-E 3 quota limits | 2 capacity units (6 requests per minute) |
| Maximum prompt tokens per request | Varies per model. For more information, see Azure OpenAI Service models |
| Max fine-tuned model deployments | |
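Because the maximum prompt tokens per request varies per model, it can be useful to count tokens before submitting a request. The following is a minimal sketch using the tiktoken package; the encoding name and the limit value are illustrative and should be checked against the model actually deployed.

```python
# A minimal sketch for checking a prompt against a per-model prompt-token limit.
import tiktoken

MAX_PROMPT_TOKENS = 128_000  # illustrative; varies per model, see the models documentation

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by the GPT-3.5/GPT-4 family
prompt = "..."  # your prompt text here
token_count = len(encoding.encode(prompt))

if token_count > MAX_PROMPT_TOKENS:
    raise ValueError(f"Prompt is {token_count} tokens; limit is {MAX_PROMPT_TOKENS}")
print(f"Prompt uses {token_count} tokens")
```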