Published on 2024-08-06

Best Practices for Using Pydantic in Python

Pydantic is a Python library that simplifies data validation using type hints. It ensures data integrity and offers an easy way to create data models with automatic type checking and validation.

In software applications, reliable data validation is crucial to prevent errors, security issues, and unpredictable behavior.

This guide provides best practices for using Pydantic in Python projects, covering model definition, data validation, error handling, and performance optimization.


Installing Pydantic

To install Pydantic, use pip, the Python package installer, with the command:

pip install pydantic

This command installs Pydantic and its dependencies.

Basic Usage

Create Pydantic models by making classes that inherit from BaseModel. Use Python type annotations to specify each field's type:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

Pydantic supports various field types, including int, str, float, bool, list, and dict. You can also define nested models and custom types:

from typing import List, Optional
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    zip_code: Optional[str] = None

class User(BaseModel):
    id: int
    name: str
    email: str
    age: Optional[int] = None
    addresses: List[Address]

Once you've defined a Pydantic model, create instances by providing the required data. Pydantic will validate the data and raise errors if any field doesn't meet the specified requirements:

user = User(
    id=1,
    name="John Doe",
    email="john.doe@example.com",
    addresses=[{"street": "123 Main St", "city": "Anytown", "zip_code": "12345"}]
)

print(user)

# Output:
# id=1 name='John Doe' email='john.doe@example.com' age=None addresses=[Address(street='123 Main St', city='Anytown', zip_code='12345')]

Defining Pydantic Models

Pydantic models use Python type annotations to define data field types.

They support various built-in types, including:

  • Primitive types: str, int, float, bool
  • Collection types: list, tuple, set, dict
  • Optional types: Optional from the typing module for fields that can be None
  • Union types: Union from the typing module to specify a field can be one of several types

Example:

from typing import List, Dict, Optional, Union
from pydantic import BaseModel

class Item(BaseModel):
    name: str
    price: float
    tags: List[str]
    metadata: Dict[str, Union[str, int, float]]

class Order(BaseModel):
    order_id: int
    items: List[Item]
    discount: Optional[float] = None

Custom Types

In addition to built-in types, you can define custom types using Pydantic's conint, constr, and other constraint functions.

These allow you to add additional validation rules, such as length constraints on strings or value ranges for integers.

Example:

from pydantic import BaseModel, conint, constr

class Product(BaseModel):
    name: constr(min_length=2, max_length=50)
    quantity: conint(gt=0, le=1000)
    price: float

product = Product(name="Laptop", quantity=5, price=999.99)
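
The same constraints can also be expressed with typing.Annotated and Field, a style the Pydantic v2 documentation generally recommends over conint and constr. The following is a minimal sketch, assuming Pydantic v2 and Python 3.9 or later; the failed_fields variable is only there for illustration.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    # Equivalent to constr(min_length=2, max_length=50)
    name: Annotated[str, Field(min_length=2, max_length=50)]
    # Equivalent to conint(gt=0, le=1000)
    quantity: Annotated[int, Field(gt=0, le=1000)]
    price: float


product = Product(name="Laptop", quantity=5, price=999.99)

# A quantity of 0 violates gt=0, so validation fails on that field only
failed_fields = []
try:
    Product(name="Laptop", quantity=0, price=999.99)
except ValidationError as exc:
    failed_fields = [err["loc"][0] for err in exc.errors()]

print(failed_fields)
```

Keeping the constraint next to the plain type (int, str) tends to play better with static type checkers than the con* helper functions.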

Required vs. Optional Fields

By default, fields in a Pydantic model are required unless explicitly marked as optional.

If a required field is missing during model instantiation, Pydantic will raise a ValidationError.

Example:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

user = User(id=1, name="John Doe")


# Output
#  Field required [type=missing, input_value={'id': 1, 'name': 'John Doe'}, input_type=dict]
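
In practice the missing-field error is usually caught rather than left to crash the program. As a rough sketch, assuming Pydantic v2, the helper below (missing_fields is a hypothetical name, not part of Pydantic) collects the names of required fields that were not supplied by filtering on the "missing" error type shown above.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str
    email: str


def missing_fields(data: dict) -> list:
    """Return the names of required fields absent from data (illustrative helper)."""
    try:
        User(**data)
    except ValidationError as exc:
        # Each error dict carries a "type" and a "loc" tuple naming the field
        return [err["loc"][0] for err in exc.errors() if err["type"] == "missing"]
    return []


print(missing_fields({"id": 1, "name": "John Doe"}))
```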

Optional Fields with Default Values

Fields can be made optional by using Optional from the typing module and providing a default value.

Example:

from pydantic import BaseModel
from typing import Optional

class User(BaseModel):
    id: int
    name: str
    email: Optional[str] = None

user = User(id=1, name="John Doe")

In this example, email is optional and defaults to None if not provided.

Nested Models

Pydantic allows models to be nested within each other, enabling complex data structures.

Nested models are defined as fields of other models, ensuring data integrity and validation at multiple levels.

Example:

from pydantic import BaseModel
from typing import Optional, List


class Address(BaseModel):
    street: str
    city: str
    zip_code: Optional[str] = None

class User(BaseModel):
    id: int
    name: str
    email: str
    addresses: List[Address]

user = User(
    id=1,
    name="John Doe",
    email="john.doe@example.com",
    addresses=[{"street": "123 Main St", "city": "Anytown"}]
)

Best Practices for Managing Nested Data

When working with nested models, it's important to:

  • Validate data at each level: Ensure each nested model has its own validation rules and constraints.
  • Use clear and consistent naming conventions: This makes the structure of your data more readable and maintainable.
  • Keep models simple: Avoid overly complex nested structures. If a model becomes too complex, consider breaking it down into smaller, more manageable components.
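
To illustrate validation at each level, the sketch below (assuming Pydantic v2) feeds a list in which the second address lacks a city. The error location reported by Pydantic walks down through the outer field, the list index, and the nested field, which makes it easier to point at the exact piece of bad data. The error_locations variable is introduced here purely for illustration.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zip_code: Optional[str] = None


class User(BaseModel):
    id: int
    name: str
    addresses: List[Address]


error_locations = []
try:
    User(
        id=1,
        name="John Doe",
        addresses=[
            {"street": "123 Main St", "city": "Anytown"},
            {"street": "456 Oak Ave"},  # city is missing here
        ],
    )
except ValidationError as exc:
    # "loc" is a tuple: outer field, list index, nested field
    error_locations = [err["loc"] for err in exc.errors()]

print(error_locations)
```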

Data Validation

Pydantic includes a set of built-in validators that handle common data validation tasks automatically.

These validators include:

  • Type validation: Ensures fields match the specified type annotations (e.g., int, str, list).
  • Range validation: Enforces value ranges and lengths using constraints like conint, constr, confloat.
  • Format validation: Checks specific formats, such as EmailStr for validating email addresses.
  • Collection validation: Ensures elements within collections (e.g., list, dict) conform to specified types and constraints.

These validators simplify the process of ensuring data integrity and conformity within your models.

Here are some examples demonstrating built-in validators:

from pydantic import BaseModel, EmailStr, conint, constr

class User(BaseModel):
    id: conint(gt=0)  # id must be greater than 0
    name: constr(min_length=2, max_length=50)  # name must be between 2 and 50 characters
    email: EmailStr  # email must be a valid email address
    age: conint(ge=18)  # age must be 18 or older

user = User(id=1, name="John Doe", email="john.doe@example.com", age=25)

In this example, the User model uses built-in validators to ensure the id is greater than 0, the name is between 2 and 50 characters, the email is a valid email address, and the age is 18 or older.
To use EmailStr, install Pydantic's optional email dependency (the quotes keep shells such as zsh from interpreting the square brackets):

pip install "pydantic[email]"

Custom Validators

Pydantic allows you to define custom validators for more complex validation logic.

Custom validators are defined using the @field_validator decorator within your model class.

Example of a custom validator:

from pydantic import BaseModel, field_validator


class Product(BaseModel):
    name: str
    price: float

    @field_validator('price')
    def price_must_be_positive(cls, value):
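        if value <= 0:
            raise ValueError('Price must be positive')
        return value

product = Product(name="Laptop", price=999.99)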

Here, the price_must_be_positive validator ensures that the price field is a positive number.

Custom validators are registered automatically when you define them within a model using the @field_validator decorator. Validators can be applied to individual fields or across multiple fields.

Example of registering a validator for multiple fields:

from pydantic import BaseModel, field_validator


class Person(BaseModel):
    first_name: str
    last_name: str

    @field_validator('first_name', 'last_name')
    def names_cannot_be_empty(cls, value):
        if not value:
            raise ValueError('Name fields cannot be empty')
        return value

person = Person(first_name="John", last_name="Doe")

In this example, the names_cannot_be_empty validator ensures that both the first_name and last_name fields are not empty.
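
When a rule depends on how two fields relate to each other rather than on each field alone, Pydantic v2 also offers the @model_validator decorator. The following is a small sketch under that assumption; the DateRange model and its field names are hypothetical and not taken from the article.

```python
from pydantic import BaseModel, ValidationError, model_validator


class DateRange(BaseModel):
    start_day: int
    end_day: int

    @model_validator(mode="after")
    def end_not_before_start(self):
        # Runs after both fields have passed their own type validation
        if self.end_day < self.start_day:
            raise ValueError("end_day must not be before start_day")
        return self


valid_range = DateRange(start_day=1, end_day=5)

error_text = ""
try:
    DateRange(start_day=5, end_day=1)
except ValidationError as exc:
    error_text = str(exc)

print(error_text)
```

With mode="after" the validator receives the already-constructed model instance, so the fields can be compared as ordinary attributes.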

Using Config Classes

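Pydantic models can be customized using an inner Config class. In Pydantic v2 this style still works but is deprecated in favor of a model_config = ConfigDict(...) class attribute, which accepts the same option names.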

This class allows you to set various configuration options that affect the model's behavior, such as validation rules, JSON serialization, and more.

Example of a Config class:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

    class Config:
        str_strip_whitespace = True  # Strip whitespace from strings
        str_min_length = 1  # Minimum length for any string field

user = User(id=1, name="  John Doe  ", email="john.doe@example.com")

print(user)

# Output:
# id=1 name='John Doe' email='john.doe@example.com'

In this example, the Config class is used to strip whitespace from string fields and enforce a minimum length of 1 for any string field.

Some common configuration options in Pydantic's Config class include:

  • str_strip_whitespace: Automatically strip leading and trailing whitespace from string fields.
  • str_min_length: Set a minimum length for any string field.
  • validate_default: Validate all fields, even those with default values.
  • validate_assignment: Enable validation on assignment to model attributes.
  • use_enum_values: Use the values of enums directly instead of the enum instances.
  • json_encoders: Define custom JSON encoders for specific types (deprecated in Pydantic v2, where custom serializers are preferred).
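
A few of these options can be sketched with the model_config = ConfigDict(...) form used in Pydantic v2. The Account model and Role enum below are hypothetical names chosen for the illustration.

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict, ValidationError


class Role(str, Enum):
    ADMIN = "admin"
    MEMBER = "member"


class Account(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,  # trim leading/trailing whitespace
        str_min_length=1,           # reject empty strings
        use_enum_values=True,       # store "admin" rather than Role.ADMIN
    )

    name: str
    role: Role


account = Account(name="  John Doe  ", role=Role.ADMIN)
print(account.model_dump())

# Whitespace is stripped first, so a blank name ends up empty and is rejected
blank_name_rejected = False
try:
    Account(name="   ", role=Role.MEMBER)
except ValidationError:
    blank_name_rejected = True

print(blank_name_rejected)
```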

Error Handling

When Pydantic finds data that doesn't conform to the model's schema, it raises a ValidationError.

This error provides detailed information about the issue, including the field name, the incorrect value, and a description of the problem.

Here's an example of how default error messages are structured:

from pydantic import BaseModel, ValidationError, EmailStr

class User(BaseModel):
    id: int
    name: str
    email: EmailStr

try:
    user = User(id='one', name='John Doe', email='invalid-email')
except ValidationError as e:
    print(e.json())

# Output:
# [{"type":"int_parsing","loc":["id"],"msg":"Input should be a valid integer, unable to parse string as an integer","input":"one","url":"https://errors.pydantic.dev/2.8/v/int_parsing"},{"type":"value_error","loc":["email"],"msg":"value is not a valid email address: An email address must have an @-sign.","input":"invalid-email","ctx":{"reason":"An email address must have an @-sign."},"url":"https://errors.pydantic.dev/2.8/v/value_error"}]

In this example, the error message will indicate that id must be an integer and email must be a valid email address.

Customizing Error Messages

Pydantic allows you to customize error messages for specific fields by raising exceptions with custom messages in validators or by setting custom configurations.

Here’s an example of customizing error messages:

from pydantic import BaseModel, ValidationError, field_validator

class Product(BaseModel):
    name: str
    price: float

    @field_validator('price')
    def price_must_be_positive(cls, value):
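        if value <= 0:
            raise ValueError('Price must be a positive number')
        return value

try:
    product = Product(name='Laptop', price=-1000)
except ValidationError as e:
    print(e.json())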

In this example, the error message for price is customized to indicate that it must be a positive number.

Best Practices for Error Reporting

Effective error reporting involves providing clear, concise, and actionable feedback to users or developers.

Here are some best practices:

  • Log errors: Use logging mechanisms to record validation errors for debugging and monitoring purposes.
  • Return user-friendly messages: When exposing errors to end-users, avoid technical jargon. Instead, provide clear instructions on how to correct the data.
  • Aggregate errors: When multiple fields are invalid, aggregate the errors into a single response to help users correct all issues at once.
  • Use consistent formats: Ensure that error messages follow a consistent format across the application for easier processing and understanding.

Examples of best practices in error reporting:

from pydantic import BaseModel, ValidationError, EmailStr
import logging

logging.basicConfig(level=logging.INFO)

class User(BaseModel):
    id: int
    name: str
    email: EmailStr

def create_user(data):
    try:
        user = User(**data)
        return user
    except ValidationError as e:
        logging.error("Validation error: %s", e.json())
        return {"error": "Invalid data provided", "details": e.errors()}

user_data = {'id': 'one', 'name': 'John Doe', 'email': 'invalid-email'}
response = create_user(user_data)
print(response)

# Output:
# ERROR:root:Validation error: [{"type":"int_parsing","loc":["id"],"msg":"Input should be a valid integer, unable to parse string as an integer","input":"one","url":"https://errors.pydantic.dev/2.8/v/int_parsing"},{"type":"value_error","loc":["email"],"msg":"value is not a valid email address: An email address must have an @-sign.","input":"invalid-email","ctx":{"reason":"An email address must have an @-sign."},"url":"https://errors.pydantic.dev/2.8/v/value_error"}]
# {'error': 'Invalid data provided', 'details': [{'type': 'int_parsing', 'loc': ('id',), 'msg': 'Input should be a valid integer, unable to parse string as an integer', 'input': 'one', 'url': 'https://errors.pydantic.dev/2.8/v/int_parsing'}, {'type': 'value_error', 'loc': ('email',), 'msg': 'value is not a valid email address: An email address must have an @-sign.', 'input': 'invalid-email', 'ctx': {'reason': 'An email address must have an @-sign.'}}]}

In this example, validation errors are logged, and a user-friendly error message is returned, helping maintain application stability and providing useful feedback to the user.
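
One possible way to aggregate errors into a consistent, user-facing shape is to map each field path to its messages. This is only a sketch, assuming Pydantic v2; summarize_errors is a hypothetical helper, and the model uses plain int fields so that no optional email dependency is needed.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str
    age: int


def summarize_errors(exc: ValidationError) -> dict:
    """Map a dotted field path to the list of messages for that field."""
    summary = {}
    for err in exc.errors():
        # "loc" may contain field names and list indexes, so stringify each part
        field = ".".join(str(part) for part in err["loc"])
        summary.setdefault(field, []).append(err["msg"])
    return summary


summary = {}
try:
    User(id="one", name="John Doe", age="old")
except ValidationError as exc:
    summary = summarize_errors(exc)

for field, messages in summary.items():
    print(field, "->", messages)
```

A flat mapping like this is easy to serialize in an API response, while the full e.errors() output can still go to the log for debugging.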


Performance Considerations

Lazy initialization is a technique that postpones the creation of an object until it is needed.

In Pydantic, this can be useful for models with fields that are costly to compute or fetch. By delaying the initialization of these fields, you can reduce the initial load time and improve performance.

Example of lazy initialization:

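from typing import Optional

from pydantic import BaseModel, PrivateAttr

class DataModel(BaseModel):
    name: str
    # Private attribute that caches the result; it is not a validated field
    _expensive_value: Optional[str] = PrivateAttr(default=None)

    @property
    def expensive_computation(self) -> str:
        # Compute on first access only, then reuse the cached value
        if self._expensive_value is None:
            # Simulate an expensive computation
            self._expensive_value = "Computed Value"
        return self._expensive_value

data_model = DataModel(name="Test")
print(data_model.expensive_computation)

In this example, expensive_computation is a property rather than a validated field. Its value is computed only when accessed for the first time and is then cached in a private attribute, reducing unnecessary computation during model initialization.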

Redundant Validation

Pydantic models automatically validate data during initialization.

However, if you know that certain data has already been validated or if validation is not necessary in some contexts, you can disable validation to improve performance.

This can be done using the model_construct method, which bypasses validation:

Example of avoiding redundant validation:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

# Constructing a User instance without validation
data = {'id': 1, 'name': 'John Doe', 'email': 'john.doe@example.com'}
user = User.model_construct(**data)

In this example, User.model_construct is used to create a User instance without triggering validation, which can be useful in performance-critical sections of your code.
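
The trade-off is that model_construct trusts its input completely. The sketch below, assuming Pydantic v2, passes the same invalid id through both paths to show the difference; the variable names are illustrative only.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str
    email: str


bad_data = {"id": "not-an-int", "name": "John Doe", "email": "john.doe@example.com"}

# No validation: the string is stored as-is even though id is annotated as int
unchecked = User.model_construct(**bad_data)
print(type(unchecked.id).__name__)

# Normal construction rejects the same data
rejected = False
try:
    User(**bad_data)
except ValidationError:
    rejected = True

print(rejected)
```

For that reason it is safest to reserve model_construct for data that has already passed through a validated model, for example rows re-read from your own database.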

Efficient Data Parsing

When dealing with large datasets or high-throughput systems, efficiently parsing raw data becomes critical.

Pydantic provides the model_validate_json method, which parses and validates a JSON string (or bytes) directly into a Pydantic model in a single step.

Example of efficient data parsing:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

json_data = '{"id": 1, "name": "John Doe", "email": "john.doe@example.com"}'
user = User.model_validate_json(json_data)
print(user)

In this example, model_validate_json is used to parse JSON data into a User model directly, providing a more efficient way to handle serialized data.
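
Two related entry points may be useful alongside model_validate_json: model_validate for data that is already a Python dict, and TypeAdapter for validating a whole JSON array in one call. TypeAdapter is not covered in the article itself; the sketch below assumes Pydantic v2.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):
    id: int
    name: str
    email: str


# Already-parsed data (for example from a database driver) goes through model_validate
user = User.model_validate(
    {"id": 1, "name": "John Doe", "email": "john.doe@example.com"}
)

# A JSON array of users can be validated in one pass with a TypeAdapter
user_list_adapter = TypeAdapter(List[User])
users = user_list_adapter.validate_json(
    '[{"id": 1, "name": "John Doe", "email": "john.doe@example.com"},'
    ' {"id": 2, "name": "Jane Roe", "email": "jane.roe@example.com"}]'
)

print(user.id, [u.name for u in users])
```

Creating the TypeAdapter once and reusing it avoids rebuilding the validation schema on every call.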

Controlling Validation

Pydantic models can be configured to validate data only when necessary.

The validate_default and validate_assignment options in the Config class control when validation occurs, which can help improve performance:

  • validate_default: When set to False (the default), default values are not validated; only values supplied during initialization are.
  • validate_assignment: When set to True, validation is performed on field assignment after the model is created.

Example configuration:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

    class Config:
        validate_default = False  # Only validate fields set during initialization
        validate_assignment = True  # Validate fields on assignment

user = User(id=1, name="John Doe", email="john.doe@example.com")
user.email = "john.new@example.com"  # This assignment will trigger validation

In this example, validate_default is left at False, which is its default value, so default values are not validated during initialization, and validate_assignment is set to True to ensure that fields are validated when they are updated.
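
To make the effect of validate_assignment concrete, here is a minimal sketch, assuming Pydantic v2 and the ConfigDict form of configuration. A compatible value is coerced on assignment, while an invalid one raises ValidationError and leaves the previous value in place.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class User(BaseModel):
    model_config = ConfigDict(validate_assignment=True)

    id: int
    name: str


user = User(id=1, name="John Doe")

# The numeric string is validated and coerced to an int
user.id = "2"
print(user.id, type(user.id).__name__)

# An invalid value is rejected and the old value is kept
assignment_rejected = False
try:
    user.id = "two"
except ValidationError:
    assignment_rejected = True

print(assignment_rejected, user.id)
```

Because every assignment now runs the validator, this option adds a small cost to attribute writes, which is worth weighing for models that are mutated in tight loops.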


Settings Management

Pydantic's BaseSettings class is designed for managing application settings, supporting environment variable loading and type validation.

This helps in configuring applications for different environments (e.g., development, testing, production).

Consider this .env file:

database_url=db
secret_key=sk
debug=False

Example of using BaseSettings:

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    database_url: str
    secret_key: str
    debug: bool = False

    class Config:
        env_file = ".env"

settings = Settings()
print(settings.model_dump())

# Output:
# {'database_url': 'db', 'secret_key': 'sk', 'debug': False}

In this example, settings are loaded from environment variables, and the Config class specifies that variables can be loaded from a .env file.

To use BaseSettings, you need to install an additional package:

pip install pydantic-settings

Managing settings effectively involves a few best practices:

  • Use environment variables: Store configuration values in environment variables to keep sensitive data out of your codebase.
  • Provide defaults: Define sensible default values for configuration settings to ensure the application runs with minimal configuration.
  • Separate environments: Use different configuration files or environment variables for different environments (e.g., .env.development, .env.production).
  • Validate settings: Use Pydantic's validation features to ensure all settings are correctly typed and within acceptable ranges.

Common Pitfalls and How to Avoid Them

One common mistake when using Pydantic is misapplying type annotations, which can lead to validation errors or unexpected behavior.

Here are a few typical mistakes and their solutions:

  • Misusing Union Types: Using Union incorrectly can complicate type validation and handling.
  • Optional Fields without Default Values: In Pydantic v2, a field annotated as Optional but given no default is still required; it merely accepts None. Forgetting the default therefore raises a ValidationError when the value is omitted.
  • Incorrect Type Annotations: Assigning incorrect types to fields can cause validation to fail. For example, using str for a field that should be an int.

Ignoring Performance Implications

Ignoring performance implications when using Pydantic can lead to slow applications, especially when dealing with large datasets or frequent model instantiations.

Here are some strategies to avoid performance bottlenecks:

  • Leverage Configuration Options: Use Pydantic's configuration options like validate_default and validate_assignment to control when validation occurs.
  • Optimize Nested Models: When working with nested models, ensure that you are not over-validating or duplicating validation logic.
  • Use Efficient Parsing Methods: Utilize model_validate_json and model_validate for efficient data parsing.
  • Avoid Unnecessary Validation: Use the model_construct method to create models without validation when the data is already known to be valid.

Overcomplicating Models

Overcomplicating Pydantic models can make them difficult to maintain and understand.

Here are some tips to keep models simple and maintainable:

  • Document Your Models: Use docstrings and comments to explain complex validation rules or business logic embedded in models.
  • Encapsulate Logic Appropriately: Keep validation and business logic within appropriate model methods or external utilities to avoid cluttering model definitions.
  • Use Inheritance Sparingly: While inheritance can promote code reuse, excessive use can make the model hierarchy complex and harder to follow.
  • Avoid Excessive Nesting: Deeply nested models can be hard to manage. Aim for a balanced level of nesting.

Conclusion

In this guide, we have covered various best practices for using Pydantic effectively in your Python projects.

We began with the basics of getting started with Pydantic, including installation, basic usage, and defining models. We then delved into advanced features like custom types, serialization and deserialization, and settings management.

Key performance considerations, such as optimizing model initialization and efficient data parsing, were highlighted to ensure your applications run smoothly.

We also discussed common pitfalls, such as misusing type annotations, ignoring performance implications, and overcomplicating models, and provided strategies to avoid them.

Applying these best practices in your real-world projects will help you leverage the full power of Pydantic, making your code more robust, maintainable, and performant.

Source: this article is reprinted from https://dev.to/devasservice/best-practices-for-using-pydantic-in-python-2021?1