Tensor: Definition and Meaning - AI Wiki

A multidimensional array of numbers, and the fundamental data structure in deep learning. A scalar is a 0D tensor (a single number). A vector is a 1D tensor. A matrix is a 2D tensor. A color image is a 3D tensor (height × width × channels). A batch of images is a 4D tensor. Model weights, activations, gradients: essentially everything in a neural network is a tensor.
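As a minimal sketch of the ranks listed above, the snippet below builds one tensor of each dimensionality. It uses NumPy as a stand-in for any tensor library; the concrete sizes (a 32 × 32 image, a batch of 8) are illustrative assumptions, not values from the text.

```python
import numpy as np

# One example per rank; all sizes are arbitrary illustrative choices.
scalar = np.array(3.14)              # 0D: a single number, shape ()
vector = np.array([1.0, 2.0, 3.0])   # 1D: shape (3,)
matrix = np.zeros((2, 3))            # 2D: shape (2, 3)
image = np.zeros((32, 32, 3))        # 3D: height x width x channels
batch = np.zeros((8, 32, 32, 3))     # 4D: 8 images stacked along a new leading axis

examples = [
    ("scalar", scalar),
    ("vector", vector),
    ("matrix", matrix),
    ("image", image),
    ("batch", batch),
]
for name, tensor in examples:
    # ndim is the number of dimensions; shape gives the size along each one.
    print(f"{name}: ndim={tensor.ndim}, shape={tensor.shape}")
```

The layout (height, width, channels) follows the text; some libraries default to a channels-first layout instead, so the same image may appear as (channels, height, width).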

Why it matters

Tensors are the language of deep learning. PyTorch, TensorFlow, and JAX are, at their core, tensor computation libraries. Understanding tensor shapes and operations is essential for reading model code, debugging shape mismatches (among the most common errors in ML code), and understanding what happens inside a neural network. If you can follow the tensor shapes, you can follow the architecture.

Deep dive

Common tensor shapes in NLP: input tokens are integers of shape (batch_size, sequence_length). Embeddings are floats of shape (batch_size, seq_len, model_dim). Attention weights have shape (batch_size, num_heads, seq_len, seq_len). Output logits have shape (batch_size, seq_len, vocab_size). Reading these shapes tells you what is actually happening: for a sequence of N tokens, each attention matrix is N × N because every token attends to every token in the sequence.
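The shapes above can be traced in a short NumPy sketch. Everything here is a toy assumption made for illustration: the sizes, the random embedding table, the reuse of embeddings as both queries and keys (a real model applies learned projections), and the reuse of the embedding table as the output projection. Only the shapes are the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, chosen only to keep the arrays small.
batch_size, seq_len, model_dim = 2, 5, 16
num_heads, vocab_size = 4, 100
head_dim = model_dim // num_heads

# Input tokens: integer ids, shape (batch_size, seq_len).
tokens = rng.integers(0, vocab_size, size=(batch_size, seq_len))

# Embedding lookup: index a (vocab_size, model_dim) table with the token ids.
embedding_table = rng.normal(size=(vocab_size, model_dim))
embeddings = embedding_table[tokens]          # (batch_size, seq_len, model_dim)

# Split model_dim into heads and move the head axis forward:
# (batch, seq, heads, head_dim) -> (batch, heads, seq, head_dim).
q = embeddings.reshape(batch_size, seq_len, num_heads, head_dim).transpose(0, 2, 1, 3)
k = q  # assumption: no learned query/key projections in this sketch

# Every token scores every token, hence the seq_len x seq_len block per head.
scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)
scores = scores - scores.max(axis=-1, keepdims=True)   # for numerical stability
attn = np.exp(scores)
attn = attn / attn.sum(axis=-1, keepdims=True)          # softmax over the last axis

# Output logits: project each position back onto the vocabulary.
logits = embeddings @ embedding_table.T       # (batch_size, seq_len, vocab_size)

print("tokens    ", tokens.shape, tokens.dtype)
print("embeddings", embeddings.shape)
print("attention ", attn.shape)
print("logits    ", logits.shape)
```

Each row of the attention tensor sums to 1 along the last axis, which is a quick sanity check that the softmax was taken over the right dimension.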

Operations

The key tensor operations are matmul (matrix multiplication, the core computation in neural networks), reshape (changing the dimensions without changing the data), transpose (swapping dimensions), concat (joining tensors along a dimension), slice (extracting subtensors), and broadcast (making differently shaped tensors compatible for element-wise operations). Deep learning is, at its core, a sequence of these operations applied to tensors.
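A minimal sketch of the six operations listed above, again using NumPy as a stand-in for any tensor library. The arrays and variable names are illustrative assumptions.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]
b = np.ones((3, 4))

# matmul: inner dimensions must agree, (2, 3) @ (3, 4) -> (2, 4).
product = a @ b

# reshape: the same six values in the same order, viewed as (3, 2).
reshaped = a.reshape(3, 2)

# transpose: axes are swapped, so the element order changes.
transposed = a.T

# concat: join along an existing axis, (2, 3) + (2, 3) -> (4, 3).
joined = np.concatenate([a, a], axis=0)

# slice: keep all rows, drop the first column -> (2, 2).
sliced = a[:, 1:]

# broadcast: the (3,) bias is stretched across both rows of the (2, 3) array.
bias = np.array([10.0, 20.0, 30.0])
broadcasted = a + bias

print(product.shape, reshaped.shape, transposed.shape)
print(joined.shape, sliced.shape, broadcasted.shape)
```

Note that `reshaped` and `transposed` have the same shape here but different contents: reshape keeps the flat element order, while transpose reorders it. Confusing the two is a bug that produces no shape error.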

GPU Acceleration

Tensors are computed on GPUs because tensor operations are massively parallel: multiplying two large matrices involves millions of independent multiply-add operations that can run simultaneously. This is why GPU VRAM matters: every tensor involved in the computation must reside in GPU memory. When you run out of VRAM, it is because the sum of all tensor sizes (during training: model weights + activations + gradients + optimizer states) exceeds capacity. Techniques such as gradient checkpointing, mixed precision, and model sharding are all ways of managing tensor memory.
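The memory sum above can be sketched as back-of-the-envelope arithmetic. The function name and its defaults are hypothetical: it assumes 32-bit values (4 bytes each), one gradient per parameter, and an Adam-style optimizer that keeps two extra values per parameter. Activation memory depends on batch size and architecture, so it is passed in rather than estimated.

```python
def training_memory_gib(num_params, bytes_per_value=4,
                        optimizer_states_per_param=2, activation_bytes=0):
    """Rough training footprint in GiB (hypothetical helper, not a library API).

    Assumes every weight, gradient, and optimizer state is stored at
    bytes_per_value bytes; real frameworks add overhead not modeled here.
    """
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value
    optimizer = num_params * optimizer_states_per_param * bytes_per_value
    total_bytes = weights + gradients + optimizer + activation_bytes
    return total_bytes / 2**30


# A hypothetical 1-billion-parameter model, activations excluded.
estimate = training_memory_gib(1_000_000_000)
print(f"{estimate:.2f} GiB before activations")  # prints 14.90 GiB before activations
```

Under these assumptions, halving `bytes_per_value` halves every term, which is the intuition behind mixed precision; in practice mixed-precision training often keeps some 32-bit copies, so the real saving is smaller.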
