Welcome to MyIaaS

Unlock the Power of AI

Elevate your AI projects with MyIaaS. Our cutting-edge AI compute services enable high-performance AI processing and efficient AI model inference on demand.

Why Choose Us?

WISECORP AI bare-metal node

Get Started Today!

We offer prudent pricing on enterprise-grade AI compute, all in-house and wholly owned.
We can also provide HPC and Beowulf CISC/RISC and/or KNL compute.
Contact me at [here] and join the businesses leveraging AI for innovation.

AI Software Frameworks

| Software Framework | Description |
| --- | --- |
| TensorFlow | Open-source platform for machine learning. |
| PyTorch | Flexible deep learning framework. |
| Caffe | Deep learning framework focused on speed. |
| MXNet | Scalable deep learning framework for efficient training. |
| Chainer | Flexible deep learning framework that allows dynamic neural network construction. |
| ONNX | Open Neural Network Exchange format for model interoperability. |
| Scikit-learn | Machine learning library for Python with efficient tools for data mining. |
| FastAI | High-level library built on top of PyTorch for simplifying deep learning. |
| Apache Spark MLlib | Scalable machine learning library for big data processing. |
| H2O.ai | Open-source platform for AI and machine learning with automatic machine learning capabilities. |
| DL4J (DeepLearning4J) | Open-source deep learning library for Java and Scala. |

and more.
| Model family | Typical parameter count | Approx. full-precision (FP32) checkpoint size | Typical quantized size (INT8 / 4-bit) |
| --- | --- | --- | --- |
| LLaMA 2 7B | ~7B | ~28–30 GB | ~4–8 GB |
| LLaMA 2 13B | ~13B | ~52–56 GB | ~8–12 GB |
| LLaMA 2 70B | ~70B | ~280–300 GB | ~30–60 GB |
| LLaMA (original) 7B | ~7B | ~28–30 GB | ~4–8 GB |
| LLaMA (original) 13B | ~13B | ~52–56 GB | ~8–12 GB |
| GPT-J 6B | ~6B | ~24 GB | ~3–6 GB |
| GPT-NeoX 20B | ~20B | ~80 GB | ~10–20 GB |
| GPT-NeoX 20B (sharded) | ~20B | ~80 GB | ~10–20 GB |
| GPT-2 XL (1.5B) | ~1.5B | ~6 GB | ~1–2 GB |
| OPT 1.3B | ~1.3B | ~5–6 GB | ~1–2 GB |
| OPT 6.7B | ~6.7B | ~26–28 GB | ~4–7 GB |
| OPT 30B | ~30B | ~120 GB | ~12–25 GB |
| MPT-7B | ~7B | ~28–30 GB | ~4–8 GB |
| MPT-30B | ~30B | ~120 GB | ~12–25 GB |
| Falcon 7B | ~7B | ~28–30 GB | ~4–8 GB |
| Falcon 40B | ~40B | ~160 GB | ~20–40 GB |
| Falcon 180B | ~180B | ~720 GB | ~70–160 GB (heavy quantisation) |
| Qwen-7B | ~7B | ~28–30 GB | ~4–8 GB |
| Qwen-14B | ~14B | ~56–60 GB | ~9–15 GB |
| Claude-style / Anthropic 52B (open variants) | ~52B | ~208 GB | ~20–45 GB |
| RWKV 7B | ~7B | ~28–30 GB | ~4–8 GB |
| RWKV 14B | ~14B | ~56–60 GB | ~9–15 GB |
| StableLM 3B | ~3B | ~12 GB | ~1.5–3 GB |
| StableLM 7B | ~7B | ~28–30 GB | ~4–8 GB |
| EleutherAI Pythia 1B | ~1B | ~4 GB | ~0.5–1 GB |
| EleutherAI Pythia 6.9B | ~6.9B | ~28 GB | ~4–7 GB |
| T5 Large (encoder-decoder, 770M) | ~0.77B | ~3 GB | ~0.5–1 GB |
| T5 3B | ~3B | ~12 GB | ~1.5–3 GB |
| UL2 20B (research) | ~20B | ~80 GB | ~10–20 GB |
| DeBERTa / BERT Large | ~0.34B–0.4B | ~1.5–1.6 GB | ~0.3–0.6 GB |
| Jurassic-1 / Jurassic-X (open variants) | ~20B–40B | ~80–160 GB | ~10–30 GB |
| XGen 7B | ~7B | ~28–30 GB | ~4–8 GB |
| Vicuna 7B (fine-tuned LLaMA) | ~7B | ~28–30 GB (base + LoRA) | ~4–8 GB |
| Alpaca 7B | ~7B | ~28–30 GB | ~4–8 GB |
| Mixtral 8x7B (sparse mixture of experts) | ~47B total (~13B active) | ~185–190 GB | ~23–47 GB |
| mT5 / multilingual variants | ~3B–11B | ~12–44 GB | ~2–8 GB |
| Custom PyTorch model (P params) | P params | ~4 × P bytes (e.g., 10B → ~40 GB) | ~0.125–0.25 × FP32 size (4-bit / INT8) |
| TinyLLM / distilled small models (1B–3B) | ~1–3B | ~4–12 GB | ~0.5–3 GB |
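The checkpoint sizes above follow directly from bytes per parameter: roughly 4 bytes at FP32, 2 at FP16, 1 at INT8, and 0.5 at 4-bit. A minimal sketch of that arithmetic (pure Python; function and constant names are illustrative, not from any library):

```python
# Rough weights-only checkpoint-size estimator. Real checkpoints also carry
# metadata, optimiser state, and quantisation scales, so treat these as
# approximate lower bounds, matching the ranges in the table above.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def checkpoint_gb(n_params: float, precision: str = "fp32") -> float:
    """Approximate checkpoint size in GB for a model with n_params parameters."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9

# A 7B-parameter model, as in several rows of the table:
print(round(checkpoint_gb(7e9, "fp32")))      # 28 (GB at full precision)
print(round(checkpoint_gb(7e9, "int4"), 1))   # 3.5 (GB at 4-bit)
```

The same arithmetic explains the "Custom PyTorch model" row: FP32 is 4 × P bytes, and INT8/4-bit land at roughly a quarter to an eighth of that.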
| Application Area | Description |
| --- | --- |
| Image Classification | Models like MobileNet and EfficientNet can classify images in real-time on devices with limited computational resources, such as smartphones and IoT devices. |
| Object Detection | Frameworks like YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector) can perform real-time object detection in videos and images, making them suitable for applications in surveillance, autonomous vehicles, and robotics. |
| Natural Language Processing (NLP) | Models like BERT and DistilBERT can be quantised for tasks such as sentiment analysis, text classification, and chatbots, allowing for efficient processing on edge devices. |
| Speech Recognition | Quantised models can be used in voice assistants and speech-to-text applications, enabling real-time processing with lower latency and reduced resource consumption. |
| Recommendation Systems | Quantised precision can be applied in recommendation algorithms to quickly process user data and provide personalised content suggestions in applications like e-commerce and streaming services. |
| Facial Recognition | Systems that require fast and efficient facial recognition can utilise models to perform real-time identification and verification in security and access control applications. |
| *R (AR, MR, VR & XR) | Models can be used in *R applications for real-time object recognition and tracking, enhancing user experiences in gaming and interactive environments. |
| Anomaly Detection | In industrial applications, models can be deployed for real-time monitoring and anomaly detection in manufacturing processes, helping to identify defects or equipment failures quickly. |
| Healthcare Diagnostics | Quantised models can assist in medical imaging analysis, such as detecting tumours in X-rays or MRIs, providing faster results in clinical settings. |
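The quantised deployments above all rest on the same mechanism: mapping FP32 weights onto a small integer range and scaling back at inference time. A minimal symmetric INT8 quantise/dequantise sketch in plain Python (illustrative only; production frameworks such as PyTorch or ONNX Runtime do this per-channel with calibration data):

```python
def quantise_int8(weights):
    """Symmetric INT8 quantisation: map floats into integer codes in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard all-zero input
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantise(codes, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return [c * scale for c in codes]

weights = [0.91, -0.42, 0.07, -1.27]
codes, scale = quantise_int8(weights)   # 1 byte per weight instead of 4
restored = dequantise(codes, scale)
# Each restored weight is within one quantisation step (scale) of the original,
# which is why accuracy loss is usually small for the applications listed above.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

Storing the codes takes one byte per weight plus a single scale, which is the 4× size reduction behind the INT8 column in the model table.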