From laptop to the Cloud: Deploying AI model on Hugging Face Hub & Spaces
AI models often remain stuck in the research phase: AI researchers, data scientists, and data engineers tend to think too little about the practical implementation of an AI model.
Fortunately, better tools for using AI models in practice are becoming available. Hugging Face released its first Transformers library years ago, making complex Transformer-based LLMs usable for everyone. This removed a lot of complexity, and Hugging Face thereby set a standard for the AI world.
Quick facts
- Launched in 2016
- Web apps, datasets, models
- Transformers library
- PyTorch, TensorFlow & JAX
What can you do with Hugging Face?
HuggingFace? What is that?
Hugging Face is an open, community-driven platform and model hub for machine learning that lets you discover, share, and host AI models in a single unified environment. Thanks to the Hub, you can publish models for various frameworks, from PyTorch and TensorFlow to JAX and pure Keras. You benefit from version control and model information (via so-called Cards) that make your work reproducible and reusable. The Hub thus serves not only as a kind of "GitHub for AI models" but also as a central place where you can easily search by tags, performance, and architecture.

In addition to the model registry, Hugging Face offers Spaces: ready-made web apps (usually built with Gradio or Streamlit) in which you can demonstrate your models directly or deploy them as an interactive service. Without having to set up a cloud environment yourself, you can create a Space in a few minutes, declare the necessary dependencies in a `requirements.txt`, configure the Space via the YAML frontmatter of its `README.md`, and test it live in the browser.

The platform also supports a dataset hub: you can upload datasets (including labeling, splitting, and SQL-like queries), which are automatically converted to the efficient Parquet format, and share them with the community. This spares you the complexity of setup and management, allowing you to focus on designing and refining your AI workflows.
Managing AI models
AI models developed by AI engineers are increasingly being treated as a regular code asset: a piece of code that is part of a larger project. Code changes often, and so does an AI model at the beginning of its development. Researchers tweak different parameters until they eventually reach a working result. When a model is treated like other code, it is usually placed on a version control platform. GitHub specializes in managing code, just as SharePoint does for business files.
Hugging Face builds on the underlying technology of Git to provide a version control system for AI models. In this way, models can easily be distributed to other developers and users. Hugging Face also ensures that an AI engineer adds extra information to the trained model, so there is transparency for other users. The developers provide a Model Card with snippets of example code, but also statistics about the AI model and its quality. They also add a License that determines whether the AI model may be used commercially or not.
Hugging Face CLI tool
Hugging Face provides a CLI tool that makes these tasks very easy to automate.
Once you have created an account, you need to obtain an access token, which the CLI uses to access your account. You can choose which permissions are associated with this token. You can then use the CLI to create a Repo (Repository), so that the model can be managed in the version control system of the Hugging Face Hub.
pip install huggingface_hub
huggingface-cli login
huggingface-cli repo create <your-username>/<repo-name> --type model
Framework independent
AI models are typically created with one of three well-known frameworks: PyTorch, TensorFlow, or Keras. Apart from ONNX, there are not many framework-independent standards yet.
Keras and TensorFlow are similar in operation, since Keras has been integrated into TensorFlow in recent versions. However, Keras uses a `.keras` file format with its own structure, which does not automatically match the structure required by the Hugging Face Hub. Hugging Face employs a standardized folder layout and metadata (such as config.json, metadata.json, and files that store the network weights in HDF5 or PyTorch format) so that other tools can load the model without additional modifications. An explicit conversion step is therefore needed to make the model "Hugging Face-ready".
Conversion step: from .keras to unpacked SavedModel
Below you see an example of how to convert a pure Keras file into an unpacked SavedModel structure that does meet the HF requirements:
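A minimal sketch of this step: a `.keras` file is in fact a zip archive that already contains `config.json`, `metadata.json`, and `model.weights.h5`, so unpacking the archive yields the expected structure (the file name `model.keras` is an assumption):

import zipfile

# A .keras file is a zip archive containing config.json, metadata.json,
# and model.weights.h5. Extracting it produces the unpacked structure
# that the Hugging Face Hub expects.
with zipfile.ZipFile("model.keras", "r") as archive:  # assumed file name
    archive.extractall("unpacked_keras")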

After execution, the following folder structure is created under unpacked_keras/:
unpacked_keras/
├── config.json
├── metadata.json
└── model.weights.h5
This structure can now be placed directly in a repository on the Hugging Face Hub. Keras and Hugging Face also allow this conversion to be performed automatically and pushed to the Repository:
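One way this can look, as a sketch assuming Keras 3 with its Hub integration and a session logged in via `huggingface-cli login` (the repo name is a placeholder):

import keras

# Load the pure Keras model and save it directly to a Hub repository.
# Recent Keras versions support the hf:// scheme for this; the upload
# happens under your logged-in Hugging Face account.
model = keras.saving.load_model("model.keras")  # assumed file name
model.save("hf://<your-username>/<repo-name>")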

If the repository does not exist yet, it will be created automatically.
The standard result is a model repository on the Hugging Face Hub containing the uploaded model files.
Through the web platform, you now receive ready-made code that allows you to download and reuse your AI model.
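For the Keras model above, that generated snippet looks roughly like this (a sketch; the repo id is a placeholder):

import keras

# Load the model straight from the Hub; the files are downloaded
# and cached locally on first use.
model = keras.saving.load_model("hf://<your-username>/<repo-name>")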

Additional data and versions can always be added to the Repository in the same way.
AI model deployment
Because AI developers and researchers do not always have the expertise to build a complete web environment around the AI model, Hugging Face also makes it easy to integrate trained AI models into custom applications. Those applications can then be developed by a specialized team.
Hugging Face additionally lets us make AI models available in a so-called Space via Gradio or Streamlit. These Python frameworks rely on underlying JavaScript technologies to render powerful, efficient browser interfaces. This way, a developer does not need to learn complex JavaScript frameworks to create a simple-looking yet responsive web application.
What is Gradio?
Gradio is an open-source Python library that allows you to build interactive web interfaces around your machine learning models in just a few lines of code. With Gradio, you structure your application according to the classic Input → Processing → Output pattern, while a FastAPI endpoint is automatically generated under the hood. This gives you an easy way to provide both a visual interface and a REST API, without having to set up a web server or API framework yourself.
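A minimal, hypothetical example of that pattern:

import gradio as gr

# Input -> Processing -> Output: the textbox feeds the function,
# and its return value is rendered in the output component.
def greet(name: str) -> str:
    return f"Hello, {name}!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()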
Thanks to the seamless support for Gradio within Hugging Face, you can get started very quickly: as soon as you put your Gradio code in a Space, the platform automatically containerizes and publishes your Python environment and dependencies. On the free tier, you can quickly debug and test prototype interfaces, after which you can easily launch public demos. Moreover, you can create and update Spaces via the Hugging Face CLI. This is perfect for CI/CD workflows or full automation, without having to configure anything manually in the Hub.
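Creating a Space programmatically can look like this (a sketch using the `huggingface_hub` Python API; the Space name is a placeholder):

from huggingface_hub import HfApi

api = HfApi()

# Create a Gradio Space; exist_ok makes this safe to re-run in CI/CD.
api.create_repo(
    repo_id="<your-username>/ai-model-space",
    repo_type="space",
    space_sdk="gradio",
    exist_ok=True,
)

You can then push your `app.py` and `requirements.txt` to the Space with `api.upload_folder`.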
The way to set up a Space is similar to that of a Repository, but the contents differ. The file structure contains at least an `app.py` and a `requirements.txt`, and optionally a `README.md` that provides information for the Space page on Hugging Face. The requirements file is needed to install all additional Python packages.
ai-model-space/
├── app.py
├── README.md
└── requirements.txt
In our Gradio code, the latest version of the AI model is downloaded each time the Space starts. This keeps the interface and the AI model separately maintainable, with no need for joint updates. It does, however, add a slight delay when the Gradio Space starts up, depending on the size of the AI model.
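A sketch of such an `app.py`, reusing the Keras model from earlier; the repo id, input size, and label order are assumptions:

import gradio as gr
import keras
import numpy as np

# Download the latest model version from the Hub on every startup,
# so the interface and the model can evolve independently.
model = keras.saving.load_model("hf://<your-username>/<repo-name>")

CLASS_NAMES = ["cat", "dog", "panda"]  # assumed label order

def classify(image):
    # Assumed preprocessing: resize to 224x224 and scale pixels to [0, 1].
    x = np.asarray(image.resize((224, 224)), dtype="float32") / 255.0
    probs = model.predict(x[None, ...])[0]
    return {name: float(p) for name, p in zip(CLASS_NAMES, probs)}

demo = gr.Interface(
    fn=classify,
    inputs=gr.Image(type="pil"),
    outputs=gr.Label(num_top_classes=3),
)
demo.launch()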

The result of this Gradio application can be seen in the Space as soon as you start it. The application classifies images of pandas, cats, and dogs.

Dataset management on Hugging Face Hub
Hugging Face not only offers a model hub but also a full-fledged platform to create, manage, and share datasets. Just like with models, each dataset is stored in a Git repository, with larger files automatically handled via Git LFS. This guarantees complete version control and collaboration within teams, without having to worry about complex cloud systems or complicated data pipelines.
After uploading your data, such as images for a classification model, you can label them directly in the Hub, split them into train, validation, and test subsets, and even search through them with simple SQL-like queries. Behind the scenes, Hugging Face automatically converts the files to the `Parquet` format, which not only saves storage space but also enables fast, column-oriented queries and analyses. This way, you have all the tools at your fingertips to deploy your dataset in a reproducible, scalable, and transparent manner.
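Consuming such a dataset from code is then straightforward (a sketch using the `datasets` library; the repo id is a placeholder):

from datasets import load_dataset

# Load the train split straight from the Hub; the Parquet files are
# fetched from the repository and cached locally.
ds = load_dataset("<your-username>/<dataset-name>", split="train")
print(ds[0])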
Transparency
Deploying AI models is no longer a secret art, and it is very important for generating revenue. Moreover, models that are actually in use can be improved with new insights.
An important point is, and remains, transparency. Informing users that an AI model is being used, and being able to explain which decision was made, remains crucial. Hugging Face plays a significant role here too, through the transparency of its model and dataset information.
Do you want to know more about the transparency of AI models? Then definitely keep an eye on this blog, and feel free to contact us for more information!
