Skip to content

Commit

Permalink
feat: add exponential backoff to outgoing requests (#22)
Browse files Browse the repository at this point in the history
* Implement basic retry mechanism

* e2e configuration, further implement retry

* Fix problems during migration

* Just use retry built into requests 🤦

* Use better Retry params

* Add custom backoff options to examples

* Refactor to share session

* Give up on testing retry itself

* Improve README, address bug

* Add missing paragraph

* Correct param names in README

* Remove unneeded import in tests

* Remove time_multiple, improve request params

* Update src/push_api_clientpy/platformclient.py

Make change non-breaking

Co-authored-by: Benjamin Taillon <[email protected]>

* Fix UT

---------

Co-authored-by: Benjamin Taillon <[email protected]>
  • Loading branch information
dmbrooke and btaillon-coveo authored Sep 19, 2023
1 parent 0cd88ea commit 8e7a179
Show file tree
Hide file tree
Showing 8 changed files with 75 additions and 28 deletions.
27 changes: 25 additions & 2 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,10 @@ Usage

See more examples in the ``./samples`` folder.

.. code-block:: python
.. code-block:: python
from push_api_clientpy import Source, DocumentBuilder
source = Source("my_api_key", "my_org_id")
myDocument = DocumentBuilder("https://my.document.uri", "My document title")\
Expand All @@ -30,6 +30,29 @@ See more examples in the ``./samples`` folder.
print(f"Document added: {response.json()}")
Exponential backoff retry configuration
=======================================

By default, the SDK leverages an exponential backoff retry mechanism. Exponential backoff allows for the SDK to make multiple attempts to resolve throttled requests, increasing the amount of time to wait for each subsequent attempt. Outgoing requests will retry when a `429` status code is returned from the platform.

The exponential backoff parameters are as follows:

* `retry_after` - The amount of time, in seconds, to wait between throttled request attempts.

Optional, will default to 5.

* `max_retries` - The maximum number of times to retry throttled requests.

Optional, will default to 10.

You may configure the exponential backoff that will be applied to all outgoing requests. To do so, specify a `BackoffOptions` object when creating either a `Source` or `PlatformClient` object:

.. code-block:: python
source = Source("my_api_key", "my_org_id", BackoffOptions(3, 10))
By default, requests will retry a maximum of 10 times, waiting 10 seconds after the second attempt, with a time multiple of 2 (which will equate to a maximum execution time of roughly 1.5 hours. See `urllib3 Retry documentation <https://urllib3.readthedocs.io/en/2.0.4/reference/urllib3.util.html#urllib3.util.Retry>`_).

Dev
===

Expand Down
4 changes: 2 additions & 2 deletions samples/pushbatchofdocuments.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
from push_api_clientpy import Source, DocumentBuilder, BatchUpdate
from push_api_clientpy import Source, DocumentBuilder, BatchUpdate, BackoffOptions
from push_api_clientpy.platformclient import BatchDelete

source = Source("my_api_key", "my_org_id")
source = Source("my_api_key", "my_org_id", BackoffOptions(3, 10, 3))


myBatchOfDocuments = BatchUpdate(addOrUpdate=[], delete=[])
Expand Down
4 changes: 2 additions & 2 deletions samples/pushdocumentwithmetadata.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
from push_api_clientpy import Source, DocumentBuilder
from push_api_clientpy import Source, DocumentBuilder, BackoffOptions

source = Source("my_api_key", "my_org_id")
source = Source("my_api_key", "my_org_id", BackoffOptions(max_retries=3))

myDocument = DocumentBuilder("https://my.document.uri", "My document title")\
.withAuthor("bob")\
Expand Down
4 changes: 2 additions & 2 deletions samples/pushonedocument.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
from push_api_clientpy import Source, DocumentBuilder
from push_api_clientpy import Source, DocumentBuilder, BackoffOptions

source = Source("my_api_key", "my_org_id")
source = Source("my_api_key", "my_org_id", backoff_options=BackoffOptions(3, 5))

myDocument = DocumentBuilder("https://my.document.uri", "My document title")\
.withData("these words will be searchable")
Expand Down
2 changes: 1 addition & 1 deletion src/push_api_clientpy/documentbuilder.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ def withData(self, data: str):
def withCompressedBinaryData(self, data: str, compressionType: CompressionType):
self.document.compressedBinaryData.compressionType = compressionType
self.document.compressedBinaryData.data = data
return self
return self

def withDate(self, date: Union[str, int, datetime]):
self.document.date = self.__validateDateAndReturnValidDate(date)
Expand Down
43 changes: 30 additions & 13 deletions src/push_api_clientpy/platformclient.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,13 @@
from .document import Document, SecurityIdentityType
from dataclasses import asdict, dataclass
from typing import Literal, TypedDict
from typing import Literal
import requests
from requests.adapters import HTTPAdapter, Retry
import json

SourceVisibility = Literal["PRIVATE", "SECURED", "SHARED"]
DEFAULT_RETRY_AFTER = 5
DEFAULT_MAX_RETRIES = 50


@dataclass
Expand Down Expand Up @@ -103,11 +106,25 @@ class BatchUpdateDocuments:
addOrUpdate: list[Document]
delete: list[BatchDelete]

@dataclass
class BackoffOptions:
retry_after: int = DEFAULT_RETRY_AFTER
max_retries: int = DEFAULT_MAX_RETRIES


class PlatformClient:
def __init__(self, apikey: str, organizationid: str):
def __init__(self, apikey: str, organizationid: str, backoff_options: BackoffOptions = BackoffOptions(), session = requests.Session()):
self.apikey = apikey
self.organizationid = organizationid
self.backoff_options = backoff_options

self.retries = Retry(total=self.backoff_options.max_retries,
backoff_factor=self.backoff_options.retry_after,
status_forcelist=[429],
respect_retry_after_header=False
)
session.mount('https://', HTTPAdapter(max_retries=self.retries))
self.session = session

def createSource(self, name: str, sourceVisibility: SourceVisibility):
data = {
Expand All @@ -118,65 +135,65 @@ def createSource(self, name: str, sourceVisibility: SourceVisibility):
}
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = self.__baseSourceURL()
return requests.post(url, json=data, headers=headers)
return self.session.post(url, json=data, headers=headers)

def createOrUpdateSecurityIdentity(self, securityProviderId: str, securityIdentityModel: SecurityIdentityModel):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__baseProviderURL(securityProviderId)}/permissions'
return requests.put(url, json=securityIdentityModel.toJSON(), headers=headers)
return self.session.put(url, json=securityIdentityModel.toJSON(), headers=headers)

def createOrUpdateSecurityIdentityAlias(self, securityProviderId: str, securityIdentityAlias: SecurityIdentityAliasModel):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__baseProviderURL(securityProviderId)}/mappings'
return requests.put(url, json=securityIdentityAlias.toJSON(), headers=headers)
return self.session.put(url, json=securityIdentityAlias.toJSON(), headers=headers)

def deleteSecurityIdentity(self, securityProviderId: str, securityIdentityToDelete: SecurityIdentityDelete):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__baseProviderURL(securityProviderId)}/permissions'
return requests.delete(url, json=securityIdentityToDelete.toJSON(), headers=headers)
return self.session.delete(url, json=securityIdentityToDelete.toJSON(), headers=headers)

def deleteOldSecurityIdentities(self, securityProviderId: str, batchDelete: SecurityIdentityDeleteOptions):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__baseProviderURL(securityProviderId)}/permissions/olderthan'
queryParams = {"orderingId": batchDelete.OrderingID,
"queueDelay": batchDelete.QueueDelay}
return requests.delete(url, params=queryParams, headers=headers)
return self.session.delete(url, params=queryParams, headers=headers)

def manageSecurityIdentities(self, securityProviderId: str, batchConfig: SecurityIdentityBatchConfig):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__baseProviderURL(securityProviderId)}/permissions/batch'
queryParams = {"fileId": batchConfig.FileID,
"orderingId": batchConfig.OrderingID}
return requests.put(url, params=queryParams, headers=headers)
return self.session.put(url, params=queryParams, headers=headers)

def pushDocument(self, sourceId: str, doc):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__basePushURL()}/sources/{sourceId}/documents'
queryParams = {"documentId": doc["documentId"]}
return requests.put(url, headers=headers, data=json.dumps(doc), params=queryParams)
return self.session.put(url, headers=headers, data=json.dumps(doc), params=queryParams)

def deleteDocument(self, sourceId: str, documentId: str, deleteChildren: bool):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__basePushURL()}/sources/{sourceId}/documents'
queryParams = {"deleteChildren": str(
deleteChildren).lower(), "documentId": documentId}
return requests.delete(url, headers=headers, params=queryParams)
return self.session.delete(url, headers=headers, params=queryParams)

def createFileContainer(self):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__basePushURL()}/files'
return requests.post(url, headers=headers)
return self.session.post(url, headers=headers)

def uploadContentToFileContainer(self, fileContainer: FileContainer, content: BatchUpdateDocuments):
headers = fileContainer.requiredHeaders
url = fileContainer.uploadUri
return requests.put(url, json=asdict(content), headers=headers)
return self.session.put(url, json=asdict(content), headers=headers)

def pushFileContainerContent(self, sourceId: str, fileContainer: FileContainer):
headers = self.__authorizationHeader() | self.__contentTypeApplicationJSONHeader()
url = f'{self.__basePushURL()}/sources/{sourceId}/documents/batch'
queryParams = {"fileId": fileContainer.fileId}
return requests.put(url=url, params=queryParams, headers=headers)
return self.session.put(url=url, params=queryParams, headers=headers)

def __basePushURL(self):
return f'https://api.cloud.coveo.com/push/v1/organizations/{self.organizationid}'
Expand Down
7 changes: 4 additions & 3 deletions src/push_api_clientpy/source.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
from .platformclient import BatchUpdateDocuments, FileContainer, PlatformClient, SecurityIdentityAliasModel, SecurityIdentityBatchConfig, SecurityIdentityDelete, SecurityIdentityDeleteOptions, SecurityIdentityModel, SourceVisibility
from .platformclient import BatchUpdateDocuments, FileContainer, PlatformClient, SecurityIdentityAliasModel, SecurityIdentityBatchConfig, SecurityIdentityDelete, SecurityIdentityDeleteOptions, SecurityIdentityModel, SourceVisibility, DEFAULT_MAX_RETRIES, DEFAULT_RETRY_AFTER
from .platformclient import BackoffOptions
from .documentbuilder import DocumentBuilder
from dataclasses import asdict, dataclass
import json
Expand All @@ -10,8 +11,8 @@ class BatchUpdate(BatchUpdateDocuments):


class Source:
def __init__(self, apikey: str, organizationid: str):
self.client = PlatformClient(apikey, organizationid)
def __init__(self, apikey: str, organizationid: str, backoff_options: BackoffOptions = BackoffOptions()):
self.client = PlatformClient(apikey, organizationid, backoff_options)

def create(self, name: str, visibility: SourceVisibility):
return self.client.createSource(name, visibility)
Expand Down
12 changes: 9 additions & 3 deletions tests/test_platformclient.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
import pytest
import requests
from push_api_clientpy import IdentityModel, PlatformClient, SecurityIdentityModel, SecurityIdentityAliasModel, AliasMapping, SecurityIdentityDelete, DocumentBuilder, BatchDelete, BatchUpdateDocuments, FileContainer, SecurityIdentityBatchConfig
from push_api_clientpy import IdentityModel, PlatformClient, SecurityIdentityModel, SecurityIdentityAliasModel, AliasMapping, SecurityIdentityDelete, DocumentBuilder, BatchDelete, BatchUpdateDocuments, FileContainer, SecurityIdentityBatchConfig, BackoffOptions


@pytest.fixture
def client():
return PlatformClient("my_key", "my_org")
return PlatformClient("my_key", "my_org", BackoffOptions(retry_after=100))


@pytest.fixture
Expand Down Expand Up @@ -41,7 +41,6 @@ def fileContainer():
def doc():
return DocumentBuilder("http://foo.com", "the_title").marshal()


def assertAuthHeader(adapter):
lastRequestHeaders = adapter.last_request.headers
assert lastRequestHeaders.get("Authorization") == "Bearer my_key"
Expand Down Expand Up @@ -177,3 +176,10 @@ def testDeleteDocument(self, client, requests_mock):

assertAuthHeader(adapter)
assertContentTypeHeaders(adapter)

def testRetryMechanismOptions(self):
new_client = PlatformClient("my_key", "my_org", BackoffOptions(retry_after=100, max_retries=10))

retry = new_client.retries
assert retry.total == 10
assert retry.backoff_factor == 100

0 comments on commit 8e7a179

Please sign in to comment.