pykoi.chat.llm package#

Submodules#

pykoi.chat.llm.abs_llm module#

Module for the Abstract LLM model class.

class pykoi.chat.llm.abs_llm.AbsLlm[source]#

Bases: ABC

Abstract LLM class.

This class is an abstract base class (ABC) for LLM classes. It ensures that all subclasses implement the predict method.

property name#

Return the name of the model.

This method must be implemented by any subclass of AbsLlm.

Raises:

NotImplementedError – This method must be implemented by subclasses.

abstract predict(message: str, num_of_response: int)[source]#

Predict the next word based on the input message.

This method must be implemented by any subclass of AbsLlm.

Parameters:
  • message (str) – The input message used to predict the next word.

  • num_of_response (int) – How many completions to generate for each prompt.

Raises:

NotImplementedError – This method must be implemented by subclasses.
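
As a rough illustration of the contract, a subclass overrides both name and predict. The sketch below re-creates the interface locally rather than importing pykoi, and EchoModel is a toy stand-in, not a real model:

```python
from abc import ABC, abstractmethod
from typing import List


class AbsLlm(ABC):
    """Local stand-in mirroring the pykoi.chat.llm.abs_llm.AbsLlm interface."""

    @property
    def name(self) -> str:
        raise NotImplementedError("This method must be implemented by subclasses.")

    @abstractmethod
    def predict(self, message: str, num_of_response: int) -> List[str]:
        """Generate num_of_response completions for the input message."""


class EchoModel(AbsLlm):
    """A toy subclass that satisfies the AbsLlm contract."""

    @property
    def name(self) -> str:
        return "echo"

    def predict(self, message: str, num_of_response: int) -> List[str]:
        # A real model would generate text; here we just echo the prompt.
        return [message] * num_of_response


model = EchoModel()
print(model.name)                 # echo
print(model.predict("hello", 2))  # ['hello', 'hello']
```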

pykoi.chat.llm.constants module#

Constants for the LLM.

class pykoi.chat.llm.constants.ModelSource(value)[source]#

Bases: Enum

This class is an enumeration of the available model sources.

HUGGINGFACE = 'huggingface'#
OPENAI = 'openai'#
PEFT_HUGGINGFACE = 'peft_huggingface'#
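
Enum members can be looked up by their string value, which is how a source name such as "openai" resolves to a member. The enum below is a local re-creation for illustration, not an import from pykoi:

```python
from enum import Enum


class ModelSource(Enum):
    """Local re-creation of pykoi.chat.llm.constants.ModelSource."""

    HUGGINGFACE = "huggingface"
    OPENAI = "openai"
    PEFT_HUGGINGFACE = "peft_huggingface"


# Lookup by value: the string "openai" maps to the OPENAI member.
source = ModelSource("openai")
print(source)        # ModelSource.OPENAI
print(source.value)  # openai
```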

pykoi.chat.llm.huggingface module#

Huggingface model for Language Model (LLM).

class pykoi.chat.llm.huggingface.HuggingfaceModel(pretrained_model_name_or_path: str, name: str | None = None, trust_remote_code: bool = True, load_in_8bit: bool = True, max_length: int = 100, device_map: str = 'auto')[source]#

Bases: AbsLlm

This class is a wrapper for the Huggingface model for Language Model (LLM) Chain. It inherits from the abstract base class AbsLlm.

classmethod create(model, tokenizer, name=None, max_length=100)[source]#

Initialize the Huggingface model with given model and tokenizer.

Parameters:
  • model – Pre-loaded model instance from Huggingface.

  • tokenizer – Pre-loaded tokenizer instance from Huggingface.

  • name (str) – The name of the model

  • max_length (int) – The maximum length for the model. Default is 100.

Returns:

An instance of HuggingfaceModel.

model_source = 'huggingface'#
property name#

Return the name of the model.

This method must be implemented by any subclass of AbsLlm.

Raises:

NotImplementedError – This method must be implemented by subclasses.

predict(message: str, num_of_response: int = 1)[source]#

Predict the next word based on the input message.

Parameters:
  • message (str) – The input message for the model.

  • num_of_response (int) – The number of responses to generate. Default is 1.

Returns:

List of responses.

Return type:

List[str]

pykoi.chat.llm.instruct_pipeline module#

class pykoi.chat.llm.instruct_pipeline.InstructionTextGenerationPipeline(*args, do_sample: bool = True, max_new_tokens: int = 256, top_p: float = 0.92, top_k: int = 0, **kwargs)[source]#

Bases: Pipeline

postprocess(model_outputs, response_key_token_id, end_key_token_id, return_full_text: bool = False)[source]#

Postprocess will receive the raw outputs of the _forward method, generally tensors, and reformat them into something more friendly. Generally it will output a list or a dict of results (containing just strings and numbers).

preprocess(instruction_text, **generate_kwargs)[source]#

Preprocess will take the input_ of a specific pipeline and return a dictionary of everything necessary for _forward to run properly. It should contain at least one tensor, but might have arbitrary other items.

pykoi.chat.llm.instruct_pipeline.get_special_token_id(tokenizer: PreTrainedTokenizer, key: str) int[source]#

Gets the token ID for a given string that has been added to the tokenizer as a special token.

When training, we configure the tokenizer so that the sequences like “### Instruction:” and “### End” are treated specially and converted to a single, new token. This retrieves the token ID each of these keys map to.

Parameters:
  • tokenizer (PreTrainedTokenizer) – the tokenizer

  • key (str) – the key to convert to a single token

Raises:

RuntimeError – if more than one ID was generated

Returns:

the token ID for the given key

Return type:

int
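
The behavior can be sketched with a minimal stand-in tokenizer; FakeTokenizer below is illustrative, not part of transformers, and the real implementation may differ in detail. The key point is that a properly registered special token must encode to exactly one ID:

```python
from typing import Dict, List


class FakeTokenizer:
    """Illustrative stand-in: maps whole special-token strings to single IDs."""

    def __init__(self, special_tokens: Dict[str, int]):
        self._special_tokens = special_tokens

    def encode(self, text: str) -> List[int]:
        if text in self._special_tokens:
            return [self._special_tokens[text]]
        # A non-special string splits into several token IDs.
        return [hash(piece) % 1000 for piece in text.split()]


def get_special_token_id(tokenizer, key: str) -> int:
    """Return the single token ID that a special-token string maps to."""
    token_ids = tokenizer.encode(key)
    if len(token_ids) > 1:
        raise RuntimeError(f"Expected a single token for {key!r}, got {token_ids}")
    return token_ids[0]


tok = FakeTokenizer({"### Instruction:": 50001, "### End": 50002})
print(get_special_token_id(tok, "### End"))  # 50002
```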

pykoi.chat.llm.model_factory module#

This module defines a factory for creating language models.

class pykoi.chat.llm.model_factory.ModelFactory[source]#

Bases: object

A factory class for creating language models.

This class provides a static method create_model which creates a language model instance based on the given name.

static create_model(model_source: str | ModelSource, **kwargs) AbsLlm[source]#

Create a language model based on the given name.

This method tries to match the given model name with the names defined in the ModelSource enumeration. If a match is found, it creates an instance of the corresponding language model. If no match is found, it raises a ValueError.

Parameters:

model_source (Union[str, ModelSource]) – The name of the language model source.

Returns:

An instance of the language model.

Return type:

AbsLlm

Raises:

ValueError – If the given model name is not valid.
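
The dispatch described above can be sketched as follows. The enum and the stub constructors are local stand-ins for the real model classes, and the error messages are illustrative, not pykoi's actual wording:

```python
from enum import Enum
from typing import Union


class ModelSource(Enum):
    HUGGINGFACE = "huggingface"
    OPENAI = "openai"
    PEFT_HUGGINGFACE = "peft_huggingface"


# Stub constructors standing in for HuggingfaceModel, OpenAIModel, etc.
def _make_huggingface(**kwargs):
    return ("huggingface-model", kwargs)


def _make_openai(**kwargs):
    return ("openai-model", kwargs)


_REGISTRY = {
    ModelSource.HUGGINGFACE: _make_huggingface,
    ModelSource.OPENAI: _make_openai,
}


def create_model(model_source: Union[str, ModelSource], **kwargs):
    """Resolve the source name to an enum member and build the model."""
    try:
        source = (
            ModelSource(model_source)
            if isinstance(model_source, str)
            else model_source
        )
    except ValueError:
        raise ValueError(f"Unknown model source: {model_source}")
    if source not in _REGISTRY:
        raise ValueError(f"Unsupported model source: {source}")
    return _REGISTRY[source](**kwargs)


print(create_model("openai", api_key="...")[0])  # openai-model
```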

pykoi.chat.llm.openai module#

This module provides a wrapper for the OpenAI model.

class pykoi.chat.llm.openai.OpenAIModel(api_key: str, name: str | None = None, engine: str = 'davinci', max_tokens: int = 100, temperature: float = 0.5)[source]#

Bases: AbsLlm

A class that wraps the OpenAI model for use in the LLMChain.

_engine#

The engine to use for the OpenAI model.

Type:

str

_max_tokens#

The maximum number of tokens to generate.

Type:

int

_temperature#

The temperature to use for the OpenAI model.

Type:

float

__init__(self, api_key: str, engine: str, max_tokens: int, temperature: float)

Initializes the OpenAI model.

predict(self, message: str)

Predicts the next word based on the given message.

model_source = 'openai'#
property name#

Return the name of the model.

This method must be implemented by any subclass of AbsLlm.

Raises:

NotImplementedError – This method must be implemented by subclasses.

predict(message: str, num_of_response: int = 1)[source]#

Predicts the next word based on the given message.

Parameters:
  • message (str) – The message to base the prediction on.

  • num_of_response (int) – How many completions to generate for each prompt. Defaults to 1.

Returns:

List of responses.

Return type:

List[str]

pykoi.chat.llm.peft_huggingface module#

Huggingface PEFT model for Language Model (LLM).

class pykoi.chat.llm.peft_huggingface.PeftHuggingfacemodel(base_model_path: str, lora_model_path: str, name: str | None = None, trust_remote_code: bool = True, load_in_8bit: bool = True, max_length: int = 100, device_map: str = 'auto')[source]#

Bases: AbsLlm

This class is a wrapper for the Huggingface PEFT model for Language Model (LLM).

_model#

The PEFT model.

Type:

PeftModel

_tokenizer#

The tokenizer for the model.

Type:

AutoTokenizer

_max_length#

The maximum length of the generated text.

Type:

int

model_source = 'peft_huggingface'#
property name#

Return the name of the model.

This method must be implemented by any subclass of AbsLlm.

Raises:

NotImplementedError – This method must be implemented by subclasses.

predict(message: str, num_of_response: int = 1)[source]#

Predict the next word based on the input message.

Parameters:
  • message (str) – The input message for the model.

  • num_of_response (int, optional) – The number of responses to generate. Default is 1.

Returns:

List of responses.

Return type:

List[str]

Module contents#