[Python 100 Days - Web Development - Peewee] Day 284 - Peewee Extensions (3): PostgreSQL

最编程 2024-03-16 08:43:07

...

Table of Contents

- 13.6.3 Using hstore
- 13.6.4 Interval support
- 13.6.5 Server-side cursors
- 13.6.6 Full-text search
- 13.6.7 postgres_ext API reference
  - ServerSide
  - class ArrayField
  - class DateTimeTZField
  - class HStoreField
  - class JSONField
  - class BinaryJSONField
  - Match
  - class TSVectorField

13.6.3 Using hstore

First, import the custom database class and the hstore functions from playhouse.postgres_ext (see the code snippet above). Then it is as simple as adding an HStoreField to your model:

class House(BaseExtModel):
    address = CharField()
    features = HStoreField()

You can now store arbitrary key/value pairs on House instances:

>>> h = House.create(
...     address='123 Main St',
...     features={'garage': '2 cars', 'bath': '2 bath'})
...
>>> h_from_db = House.get(House.id == h.id)
>>> h_from_db.features
{'bath': '2 bath', 'garage': '2 cars'}

You can filter by a single key, multiple keys, or a partial dictionary:

query = House.select()
garage = query.where(House.features.contains('garage'))
garage_and_bath = query.where(House.features.contains(['garage', 'bath']))
twocar = query.where(House.features.contains({'garage': '2 cars'}))

Suppose you want to do an atomic update to the house:

>>> new_features = House.features.update(bath='2.5 bath', sqft='1100')
>>> query = House.update(features=new_features)
>>> query.where(House.id == h.id).execute()
1
>>> h = House.get(House.id == h.id)
>>> h.features
{'bath': '2.5 bath', 'garage': '2 cars', 'sqft': '1100'}

Or, alternatively, an atomic delete:

>>> query = House.update(features=House.features.delete('bath'))
>>> query.where(House.id == h.id).execute()
1
>>> h = House.get(House.id == h.id)
>>> h.features
{'garage': '2 cars', 'sqft': '1100'}

Multiple keys can be deleted at the same time:

>>> query = House.update(features=House.features.delete('garage', 'sqft'))
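The effect of these update and delete operations on one row's value can be modeled in plain Python. The sketch below is only an illustration of the resulting key/value pairs; the helper names hstore_update and hstore_delete are hypothetical, and it does not reproduce the atomicity that the database provides by doing the work inside a single UPDATE statement.

```python
def hstore_update(current, **data):
    """Model of an hstore update: merge new pairs, new values win."""
    merged = dict(current)
    merged.update(data)
    return merged


def hstore_delete(current, *keys):
    """Model of an hstore delete: drop the named keys, ignore missing ones."""
    return {k: v for k, v in current.items() if k not in keys}


features = {'garage': '2 cars', 'bath': '2 bath'}
after_update = hstore_update(features, bath='2.5 bath', sqft='1100')
after_delete = hstore_delete(after_update, 'bath')
after_multi_delete = hstore_delete(after_delete, 'garage', 'sqft')

print(after_update)
print(after_delete)
print(after_multi_delete)
```

The intermediate values mirror the House examples above: the update merges the new pairs into the existing ones, and each delete removes only the named keys.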

You can select just the keys, just the values, or zip the two together:

>>> for h in House.select(House.address, House.features.keys().alias('keys')):
...     print(h.address, h.keys)

123 Main St [u'bath', u'garage']

>>> for h in House.select(House.address, House.features.values().alias('vals')):
...     print(h.address, h.vals)

123 Main St [u'2 bath', u'2 cars']

>>> for h in House.select(House.address, House.features.items().alias('mtx')):
...     print(h.address, h.mtx)

123 Main St [[u'bath', u'2 bath'], [u'garage', u'2 cars']]

You can retrieve a slice of the data, for example all of the garage data:

>>> query = House.select(House.address, House.features.slice('garage').alias('garage_data'))
>>> for house in query:
...     print(house.address, house.garage_data)

123 Main St {'garage': '2 cars'}

You can check for the existence of a key and filter rows accordingly:

>>> has_garage = House.features.exists('garage')
>>> for house in House.select(House.address, has_garage.alias('has_garage')):
...     print(house.address, house.has_garage)

123 Main St True

>>> for house in House.select().where(House.features.exists('garage')):
...     print(house.address, house.features['garage'])  # <-- just houses w/garage data

123 Main St 2 cars

13.6.4 Interval support

Postgres supports durations through the INTERVAL data type (docs).

class IntervalField([null=False[, ...]])

Field class capable of storing Python datetime.timedelta instances.

Example:

from datetime import timedelta

from playhouse.postgres_ext import *

db = PostgresqlExtDatabase('my_db')

class Event(Model):
    location = CharField()
    duration = IntervalField()
    start_time = DateTimeField()

    class Meta:
        database = db

    @classmethod
    def get_long_meetings(cls):
        return cls.select().where(cls.duration > timedelta(hours=1))
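The comparison in get_long_meetings() relies on ordinary timedelta ordering. As a minimal stdlib-only sketch (the meeting names and durations are made up for illustration), the same "longer than one hour" test looks like this in plain Python:

```python
from datetime import timedelta

# Hypothetical meetings and their durations.
durations = {
    'standup': timedelta(minutes=15),
    'planning': timedelta(hours=1, minutes=30),
    'retro': timedelta(hours=1),
}

threshold = timedelta(hours=1)

# Strictly greater than one hour, as in cls.duration > timedelta(hours=1).
long_meetings = sorted(
    name for name, duration in durations.items() if duration > threshold)

print(long_meetings)
```

Note that a meeting of exactly one hour is not selected, because the comparison is strict.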

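13.6.5 Server-side cursors

When psycopg2 executes a query, normally all results are fetched by the backend and returned to the client. This can cause your application to use a lot of memory when making large queries. With server-side cursors, results are returned a little at a time (by default 2000 records). For the definitive reference, see the psycopg2 documentation.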
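The idea of receiving results a batch at a time can be illustrated without a Postgres server. The sketch below uses the standard sqlite3 module and fetchmany() purely as an analogy; sqlite3 runs in-process, so this is not a server-side cursor, and the helper name iter_in_batches and the page_view table are assumptions made for the example.

```python
import sqlite3


def iter_in_batches(cursor, batch_size=2000):
    """Yield rows one by one while holding at most one batch in memory."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE page_view (id INTEGER PRIMARY KEY, url TEXT)')
conn.executemany(
    'INSERT INTO page_view (url) VALUES (?)',
    [('/page/%d' % i,) for i in range(5000)])

cursor = conn.execute('SELECT id, url FROM page_view ORDER BY id')
count = sum(1 for _ in iter_in_batches(cursor))
conn.close()

print(count)
```

With psycopg2 the batching happens between client and server, and the batch size of a named cursor is controlled by its itersize attribute.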

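Note
To use server-side (or named) cursors, you must be using PostgresqlExtDatabase.

To execute a query using a server-side cursor, simply wrap your select query with the ServerSide() helper: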

large_query = PageView.select()  # Build query normally.

# Iterate over large query inside a transaction.
for page_view in ServerSide(large_query):
    # do some interesting analysis here.
    pass

# Server-side resources are released.

If you would like all SELECT queries to automatically use a server-side cursor, you can specify this when creating your PostgresqlExtDatabase:

from playhouse.postgres_ext import PostgresqlExtDatabase

ss_db = PostgresqlExtDatabase('my_db', server_side_cursors=True)

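Note
Server-side cursors live only as long as the transaction, so peewee will not automatically call commit() after executing a SELECT query. If you do not commit after you are done iterating, the server-side resources are not released until the connection is closed (or the transaction is committed later). Furthermore, since peewee caches rows returned by the cursor by default, you should always call .iterator() when iterating over a large query.
If you are using the ServerSide() helper, the transaction and the call to iterator() are handled transparently.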

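13.6.6 Full-text search

Postgresql provides sophisticated full-text search using special data types (tsvector and tsquery). Documents should be stored as, or converted to, the tsvector type, and search queries should be converted to tsquery.

For simple cases, you can use the Match() function, which automatically performs the appropriate conversions and requires no schema changes: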

def blog_search(search_term):
    return Blog.select().where(
        (Blog.status == Blog.STATUS_PUBLISHED) &
        Match(Blog.content, search_term))

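The Match() function automatically converts the left-hand operand to a tsvector and the right-hand operand to a tsquery. For better performance, it is recommended that you create a GIN index on the column you plan to search: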

CREATE INDEX blog_full_text_search ON blog USING gin(to_tsvector(content));

Alternatively, you can use the TSVectorField to maintain a dedicated column for storing tsvector data:

class Blog(Model):
    content = TextField()
    search_content = TSVectorField()

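Note
TSVectorField will automatically be created with a GIN index.

You will need to explicitly convert the incoming text data to tsvector when inserting or updating the search_content field: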

content = 'Excellent blog post about peewee ORM.'
blog_entry = Blog.create(
    content=content,
    search_content=fn.to_tsvector(content))

To perform a full-text search, use TSVectorField.match():

terms = 'python & (sqlite | postgres)'
results = Blog.select().where(Blog.search_content.match(terms))
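To make the boolean query syntax in terms concrete, here is a toy evaluator in plain Python. It is only a sketch of how the operators & (and), | (or), ! (not) and parentheses combine; the function names are hypothetical, and it ignores what Postgres actually does when building a tsvector, such as stemming and stop-word removal.

```python
import re


def tokenize(query):
    """Split a query into terms and the operators & | ! ( )."""
    return re.findall(r"[A-Za-z0-9_]+|[&|!()]", query)


def matches(query, document):
    """Evaluate a tsquery-like boolean expression against a document."""
    words = set(re.findall(r"[a-z0-9_]+", document.lower()))
    tokens = tokenize(query)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == '|':
            pos += 1
            rhs = parse_and()  # always parse, even if value is already True
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_not()
        while pos < len(tokens) and tokens[pos] == '&':
            pos += 1
            rhs = parse_not()  # always parse, even if value is already False
            value = value and rhs
        return value

    def parse_not():
        nonlocal pos
        if tokens[pos] == '!':
            pos += 1
            return not parse_not()
        if tokens[pos] == '(':
            pos += 1
            value = parse_or()
            pos += 1  # skip the closing parenthesis
            return value
        term = tokens[pos].lower()
        pos += 1
        return term in words

    return parse_or()


terms = 'python & (sqlite | postgres)'
print(matches(terms, 'Peewee is a Python ORM for SQLite'))
print(matches(terms, 'Python web development with Flask'))
```

The first document contains both "python" and "sqlite", so it matches; the second contains "python" but neither database name, so it does not.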

For more information, see the Postgres full-text search documentation.

13.6.7 postgres_ext API reference

class PostgresqlExtDatabase(database[, server_side_cursors=False[, register_hstore=False[, …]]])
Identical to PostgresqlDatabase, but required in order to support:

Server-side cursors
ArrayField
DateTimeTZField
JSONField
BinaryJSONField
HStoreField
TSVectorField

Parameters:

database (str) – Name of the database to connect to.
server_side_cursors (bool) – Whether SELECT queries should use server-side cursors.
register_hstore (bool) – Register the HStore extension with the connection.

If you wish to use the HStore extension, you must specify register_hstore=True.

If using server_side_cursors, also be sure to wrap your queries with ServerSide().

ServerSide

ServerSide(select_query)

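Parameters: select_query – a SelectQuery instance.
Return type: generator

Wraps the given select query in a transaction and calls its iterator() method to avoid caching row instances. In order for the server-side resources to be released, be sure to exhaust the generator (iterate over all the rows).

Usage: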

large_query = PageView.select()
for page_view in ServerSide(large_query):
    # Do something interesting.
    pass

# At this point server side resources are released.
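Why the generator has to be exhausted can be shown with a toy model in plain Python. The function server_side below is a hypothetical stand-in, not peewee's implementation: it records when a pretend transaction begins and when the pretend resources are released, which only happens once the generator finishes or is explicitly closed.

```python
def server_side(rows, log):
    """Toy generator: open a 'transaction', yield rows, release at the end."""
    log.append('begin transaction')
    try:
        for row in rows:
            yield row
    finally:
        # Runs when the generator is exhausted or closed.
        log.append('commit / release cursor')


# Case 1: iterate over every row, so the cleanup runs.
full_log = []
all_rows = list(server_side(range(3), full_log))

# Case 2: stop after the first row while keeping a reference.
partial_log = []
gen = server_side(range(3), partial_log)
first = next(gen)
log_before_close = list(partial_log)  # cleanup has not happened yet
gen.close()                           # explicit close triggers the cleanup

print(full_log)
print(log_before_close)
print(partial_log)
```

In the second case the cleanup entry appears only after close() is called, which is the situation the warning above is about.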

class ArrayField

class ArrayField([field_class=IntegerField[, field_kwargs=None[, dimensions=1[, convert_values=False]]]])

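Parameters:

field_class – a subclass of Field, e.g. IntegerField.
field_kwargs (dict) – arguments to initialize field_class.
dimensions (int) – dimensions of the array.
convert_values (bool) – apply the field_class value conversion to array data.

Field capable of storing arrays of the provided field_class.

Note
By default ArrayField will use a GIN index. To disable this, initialize the field with index=False.

You can store and retrieve lists (or lists-of-lists):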

class BlogPost(BaseModel):
    content = TextField()
    tags = ArrayField(CharField)


post = BlogPost(content='awesome', tags=['foo', 'bar', 'baz'])

Additionally, you can use the __getitem__ API to query values or slices in the database:

# Get the first tag on a given blog post.
first_tag = (BlogPost
             .select(BlogPost.tags[0].alias('first_tag'))
             .where(BlogPost.id == 1)
             .dicts()
             .get())

# first_tag = {'first_tag': 'foo'}

Get a slice of values:

# Get the first two tags.
two_tags = (BlogPost
            .select(BlogPost.tags[:2].alias('two'))
            .dicts()
            .get())
# two_tags = {'two': ['foo', 'bar']}

contains(*items)

Parameters: items – one or more items that must all be in the given array field.

# Get all blog posts that are tagged with both "python" and "django".
BlogPost.select().where(BlogPost.tags.contains('python', 'django'))

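contains_any(*items)

Parameters: items – one or more items to search for in the given array field.
Like contains(), except that it matches rows where the array contains any of the given items.

# Get all blog posts that are tagged with "flask" and/or "django".
BlogPost.select().where(BlogPost.tags.contains_any('flask', 'django'))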
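The difference between contains() and contains_any() is the difference between "all of" and "at least one of". A minimal plain-Python sketch of the two tests follows; the helper names are hypothetical and the code only models the matching rule, not the SQL that peewee generates (to my understanding these map onto the Postgres array operators @> and &&).

```python
def array_contains(tags, *items):
    """True when every item is present in tags."""
    return set(items) <= set(tags)


def array_contains_any(tags, *items):
    """True when at least one item is present in tags."""
    return bool(set(items) & set(tags))


tags = ['python', 'flask', 'peewee']

print(array_contains(tags, 'python', 'flask'))       # both present
print(array_contains(tags, 'python', 'django'))      # 'django' missing
print(array_contains_any(tags, 'flask', 'django'))   # 'flask' present
print(array_contains_any(tags, 'django', 'rails'))   # none present
```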

class DateTimeTZField

class DateTimeTZField(*args, **kwargs)

A timezone-aware subclass of DateTimeField.

class HStoreField

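class HStoreField(*args, **kwargs)

A field for storing and retrieving arbitrary key/value pairs. For details on usage, see hstore support.

Note
To use HStoreField you need to make sure the hstore extension is registered with the connection. To accomplish this, instantiate PostgresqlExtDatabase with register_hstore=True.

Note
By default HStoreField will use a GiST index. To disable this, initialize the field with index=False.

keys()

Returns the keys for a given row.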

>>> for h in House.select(House.address, House.features.keys().alias('keys')):
...     print(h.address, h.keys)

123 Main St [u'bath', u'garage']

values()

Returns the values for a given row.

>>> for h in House.select(House.address, House.features.values().alias('vals')):
...     print(h.address, h.vals)

123 Main St [u'2 bath', u'2 cars']

items()

Like a Python dict, returns the keys and values in a list-of-lists:

>>> for h in House.select(House.address, House.features.items().alias('mtx')):
...     print(h.address, h.mtx)

123 Main St [[u'bath', u'2 bath'], [u'garage', u'2 cars']]

slice(*args)

Returns a slice of the data for the given list of keys.

>>> for h in House.select(House.address, House.features.slice('garage').alias('garage_data')):
...     print(h.address, h.garage_data)

123 Main St {'garage': '2 cars'}

exists(key)

Query for whether the given key exists.

>>> for h in House.select(House.address, House.features.exists('garage').alias('has_garage')):
...     print(h.address, h.has_garage)

123 Main St True

>>> for h in House.select().where(House.features.exists('garage')):
...     print(h.address, h.features['garage']) # <-- just houses w/garage data

123 Main St 2 cars

defined(key)

Query for whether the given key has a value associated with it.
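The distinction between exists() and defined() only shows up when a key is present but its value is NULL. A small plain-Python model makes this visible, under the assumption that None stands in for an hstore NULL value; the helper names are hypothetical and this is not how peewee implements the calls.

```python
def exists(features, key):
    """True when the key is present, whatever its value."""
    return key in features


def defined(features, key):
    """True when the key is present and its value is not NULL (None here)."""
    return features.get(key) is not None


# 'pool' is present but has no value.
features = {'garage': '2 cars', 'pool': None}

print(exists(features, 'garage'), defined(features, 'garage'))
print(exists(features, 'pool'), defined(features, 'pool'))
print(exists(features, 'sauna'), defined(features, 'sauna'))
```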

update(**data)

Perform an atomic update to the keys/values for a given row or rows.

>>> query = House.update(features=House.features.update(
...     sqft=2000,
...     year_built=2012))
>>> query.where(House.id == 1).execute()

delete(*keys)

Delete the provided keys for a given row or rows.

Note
We will use an UPDATE query.

>>> query = House.update(features=House.features.delete(
...     'sqft', 'year_built'))
>>> query.where(House.id == 1).execute()

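contains(value)

Parameters: value – either a dict, a list of keys, or a single key.
Query rows for the existence of either:

a partial dictionary.
a list of keys.
a single key.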

>>> query = House.select()
>>> has_garage = query.where(House.features.contains('garage'))
>>> garage_bath = query.where(House.features.contains(['garage', 'bath']))
>>> twocar = query.where(House.features.contains({'garage': '2 cars'}))

contains_any(*keys)

Parameters: keys – one or more keys to search for.
Query rows for the existence of any of the given keys.
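The three forms accepted by contains(), plus contains_any(), can be summarized with a plain-Python model of the matching rules. This is a sketch for illustration only; the helper names hstore_contains and hstore_contains_any are hypothetical and the code does not touch the database.

```python
def hstore_contains(features, value):
    """Model the three forms of contains(): dict, list of keys, single key."""
    if isinstance(value, dict):
        # Partial dictionary: every given pair must match exactly.
        return all(k in features and features[k] == v
                   for k, v in value.items())
    if isinstance(value, (list, tuple)):
        # List of keys: every key must be present.
        return all(k in features for k in value)
    # Single key.
    return value in features


def hstore_contains_any(features, *keys):
    """True when at least one of the keys is present."""
    return any(k in features for k in keys)


features = {'garage': '2 cars', 'bath': '2 bath'}

print(hstore_contains(features, 'garage'))
print(hstore_contains(features, ['garage', 'bath']))
print(hstore_contains(features, {'garage': '2 cars'}))
print(hstore_contains(features, {'garage': '1 car'}))
print(hstore_contains_any(features, 'sqft', 'bath'))
```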

class JSONField

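class JSONField(dumps=None, *args, **kwargs)

Parameters: dumps – the function used to serialize values; the default is json.dumps(). You can override it to create a customized JSON wrapper.
Field class suitable for storing and querying arbitrary JSON. When using this on a model, set the field's value to a Python object (either a dict or a list). When you retrieve the value from the database, it is returned as a Python data structure.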
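As an illustration of what a custom serializer for the dumps parameter might look like, the sketch below wraps json.dumps() so that dates and decimals, which the default encoder rejects, are written as strings. The name custom_dumps and the choice of string encodings are assumptions for this example; passing it as JSONField(dumps=custom_dumps) is how it would presumably be wired up.

```python
import json
from datetime import date
from decimal import Decimal


def custom_dumps(value):
    """Serialize to JSON, encoding dates and decimals as strings."""
    def default(obj):
        if isinstance(obj, date):  # also covers datetime, a subclass of date
            return obj.isoformat()
        if isinstance(obj, Decimal):
            return str(obj)
        raise TypeError('Cannot serialize %r' % type(obj))
    return json.dumps(value, default=default, sort_keys=True)


encoded = custom_dumps({'price': Decimal('9.99'), 'day': date(2024, 3, 16)})
print(encoded)
```

Values encoded this way come back from the database as plain strings, so any conversion back to date or Decimal has to be done by the application.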

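Note
You must be using Postgres 9.2 / psycopg2 2.5 or greater.

Note
If you are using Postgres 9.4, strongly consider using the BinaryJSONField instead, as it offers better performance and more powerful querying options.

Example model declaration: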

db = PostgresqlExtDatabase('my_db')

class APIResponse(Model):
    url = CharField()
    response = JSONField()

    class Meta:
        database = db

Example of storing JSON data:

import json
from urllib.request import urlopen

url = 'http://foo.com/api/resource/'
resp = json.loads(urlopen(url).read())
APIResponse.create(url=url, response=resp)

APIResponse.create(url='http://foo.com/baz/', response={'key': 'value'})

To query, use Python's [] operators to specify nested key or array lookups:

APIResponse.select().where(
    APIResponse.response['key1']['nested-key'] == 'some-value')

To illustrate the use of the [] operators, imagine we have the following data stored in an APIResponse:

{
  "foo": {
    "bar": ["i1", "i2", "i3"],
    "baz": {
      "huey": "mickey",
      "peewee": "nugget"
    }
  }
}
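A nested lookup can yield either the JSON text of a sub-document or the parsed Python value. The difference can be reproduced with the standard json module alone; the sketch below is a plain-Python model of the same data, and the helper name lookup is hypothetical.

```python
import json

stored = {
    "foo": {
        "bar": ["i1", "i2", "i3"],
        "baz": {"huey": "mickey", "peewee": "nugget"},
    }
}


def lookup(data, *path):
    """Follow a sequence of keys or indexes into nested data."""
    for key in path:
        data = data[key]
    return data


bar = lookup(stored, 'foo', 'bar')
bar_as_text = json.dumps(bar)            # JSON representation (a string)
bar_as_python = json.loads(bar_as_text)  # parsed back into a list
first_item = lookup(stored, 'foo', 'bar', 0)

print(repr(bar_as_text))
print(bar_as_python)
print(first_item)
```

The string form corresponds to what a lookup returns as a JSON representation, while the parsed form corresponds to asking for the value as a Python object.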

Here are the results of a few queries:

def get_data(expression):
    # Helper function to just retrieve the results of a
    # particular expression.
    query = (APIResponse
             .select(expression.alias('my_data'))
             .dicts()
             .get())
    return query['my_data']

# Accessing the foo -> bar subkey will return a JSON
# representation of the list.
get_data(APIResponse.response['foo']['bar'])
# '["i1", "i2", "i3"]'

# In order to retrieve this list as a Python list,
# we will call .as_json() on the expression.
get_data(APIResponse.response['foo']['bar'].as_json())
# ['i1', 'i2', 'i3']

# Similarly, accessing the foo -> baz subkey will
# return a JSON representation of the dictionary.
get_data(APIResponse.response['foo']['baz'])
# '{"huey": "mickey", "peewee": "nugget"}'

# Again, calling .as_json() will return an actual
# python dictionary.
get_data(APIResponse.response['foo']['baz'].as_json())
# {'huey': 'mickey', 'peewee': 'nugget'}

# When dealing with simple values, either way works as
# you expect.
get_data(APIResponse.response['foo']['bar'][0
						