Questions we presented to a Senior Scientist at OpenAI during a conference. Feel free to review them. #105
Unanswered · asked by BruceYanghy in Q&A · 0 replies
Q1: Large Model Hyperparameter Tuning Strategy: How can you efficiently search and adjust hyperparameters for large models given their extensive hyperparameter space? Are there best practices or things to be cautious of?
Answer: There's no special method for hyperparameter tuning. It mostly relies on personal experience.
Q2: Interpreting Large Models: Large models are often seen as "black boxes." Does OpenAI have effective methods or tools to enhance the interpretability of these large models?
Answer: Currently, there's no reliable method to increase the interpretability of large models; it's arguably a false proposition.
Q3: Hallucination in Large Models: When developing large models, how can we identify and guard against "illusions/hallucinations" – i.e., outputs that seem logical but are misleading or false? Does OpenAI have systematic methods or standard procedures to assess and mitigate these risks?
Answer: To reduce hallucination in large models, improvements can be made at the data layer, and memorization training can be done during the fine-tuning phase.
Q4: Customized vs. Universal Large Models: Will specialized large models become obsolete if universal models, like future GPT-5, GPT-6, or GPT-7, outperform in specific domains (e.g., finance)?
Answer: OpenAI has no plans for specialized models, but specialized models are promising, especially in industries with private or unique data like finance.
Q5: Large Model Inference Quantization: How can inference efficiency be improved without compromising output quality?
Answer: For model weight quantization specifics, refer to the GPTQ paper.
Q6: Model Quantization: Why does int4 quantization reduce model size significantly but has lower inference efficiency than int8? Are there any solutions?
Answer: There are two aspects of model quantization. First, using low-precision hardware for computation and second, reducing the memory bandwidth required for model throughput. If the int4 performance on a chip is not as good as int8, consider storing the model in int4 and converting to int8 for computations.
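As a rough illustration of the "store in int4, convert to int8 for computation" suggestion, here is a minimal sketch in NumPy. The helper names and packing layout are hypothetical, not from any OpenAI or GPTQ code; the point is simply that two signed 4-bit values fit in one byte (halving memory bandwidth) and can be sign-extended back to int8 before the matmul.

```python
import numpy as np

def pack_int4(values: np.ndarray) -> np.ndarray:
    """Pack pairs of signed int4 values (-8..7) into single bytes."""
    assert values.size % 2 == 0
    u = (values.astype(np.int8) & 0x0F).astype(np.uint8)  # keep low nibble (two's complement)
    return (u[0::2] | (u[1::2] << 4)).astype(np.uint8)

def unpack_int4_to_int8(packed: np.ndarray) -> np.ndarray:
    """Unpack bytes back into signed int8 values for computation."""
    lo = (packed & 0x0F).astype(np.int8)
    hi = ((packed >> 4) & 0x0F).astype(np.int8)
    # sign-extend 4-bit two's complement to 8 bits
    lo = np.where(lo > 7, lo - 16, lo)
    hi = np.where(hi > 7, hi - 16, hi)
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2] = lo
    out[1::2] = hi
    return out

weights = np.array([-8, 7, 0, -1, 3, -4], dtype=np.int8)
packed = pack_int4(weights)              # half the bytes of int8 storage
restored = unpack_int4_to_int8(packed)   # int8 values, ready for int8 kernels
assert np.array_equal(restored, weights)
```

Real inference stacks fuse the unpacking into the GEMM kernel so the int4→int8 conversion happens in registers rather than in a separate pass.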
Q7: Open-Sourcing Inference Acceleration Frameworks: Are there future plans to open source the inference acceleration framework related to ChatGPT?
Answer: It's believed that OpenAI will not open-source the model inference acceleration framework.
Q8: Controlling Large Model Input Parameters: How to effectively control the input configurations of large models and evaluate the generated results?
Answer: It relies on personal experience.
Q9: Safety of Large Model Responses: How can the safety of responses from large models be ensured? And how can online inference latency be reduced?
Answer: There's no specific method to improve the safety of large model responses. The solution is to feed the model with targeted data collection.
Q10: Future of Large Model Training: Will there be new distributed training methods beyond tensor parallelism and 3D parallelism?
Answer: The future direction of distributed training seems to be automation, replacing manual strategies with automated software that allocates resources based on cluster configurations.
Q11: Handling Failures during Training: How to address single point failures during training and quickly take over tasks from failed nodes?
Answer: Restart and find a machine without faults.
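The "restart on a healthy machine" answer implicitly assumes periodic checkpointing so the job can resume rather than start over. A minimal sketch of that pattern (the file name and training loop are hypothetical stand-ins; real jobs write sharded checkpoints to shared storage):

```python
import json
import os
import tempfile

CKPT = "checkpoint.json"  # hypothetical path; real jobs use a shared filesystem

def save_checkpoint(step: int, state: dict) -> None:
    """Write atomically so a crash mid-write never corrupts the checkpoint."""
    fd, tmp = tempfile.mkstemp(dir=".")
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, CKPT)  # atomic rename

def load_checkpoint():
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {"loss": None}

start, state = load_checkpoint()
for step in range(start, 10):
    state["loss"] = 1.0 / (step + 1)  # stand-in for a real training step
    if step % 5 == 0:
        save_checkpoint(step + 1, state)
```

If a node dies, the scheduler relaunches the job on a fault-free machine, `load_checkpoint` picks up the latest step, and only the work since the last checkpoint is lost.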
Q12: Vector Databases: Is there a real demand for vector databases? Some like AutoGPT seem to be moving away from them.
Answer: Vector databases might be needed in specialized models or enterprise-scale models.
Q13: Crowdsourcing Feedback for Open-source Large Language Models: Would OpenAI be willing to create a platform for this?
Answer: No plans at the moment.
Q14: Using RAG vs. Fine-tuning in LLMs: When to use Retrieval-Augmented Generation and when to fine-tune?
Answer: There's no fixed strategy. They aren't mutually exclusive. For instance, fine-tuning specialized models can enhance results, but it doesn't mean RAG isn't needed.
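To make the RAG half of the comparison concrete, here is a toy retrieval step: embed documents, retrieve the most similar one to the query, and prepend it to the prompt. Everything here is a simplified assumption for illustration; production systems use learned embedding models and a vector index, not bag-of-words cosine similarity over a Python list.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real RAG uses a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical private corpus, e.g. a finance firm's internal documents
docs = [
    "The fiscal year 2023 revenue grew 12 percent.",
    "Employees may work remotely two days per week.",
]

def retrieve(query: str) -> str:
    """Return the document most similar to the query."""
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))

question = "What was revenue growth in fiscal 2023?"
context = retrieve(question)
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
```

Fine-tuning would instead bake domain knowledge into the weights; as the answer notes, the two combine well: a fine-tuned specialized model can still be fed retrieved context for facts that change after training.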
Q15: Evaluation Methods at OpenAI: What are the evaluation methods used at OpenAI for their models?
Answer: There are general methods as well as proprietary ones that are not disclosed, but there are open-source LLM leaderboards.
Q16: Future of Autonomous Systems like AutoGPT: What's the trend?
Answer: Autonomous agents have a promising future, but it might take some time.
Q17: Large Models Emulating Celebrities: How to train a model to emulate a celebrity's experiences and statements?
Answer: Digital avatars of celebrities can be made. Pre-training isn't required separately. The model can be trained during the fine-tuning phase using the celebrity's data.
Q18: 2024 Predictions: If everyone is training large models in 2023, does it mean that inference will be the focus in 2024?
Answer: Every year, different people or companies have different focuses. In 2023, both training and inference are being worked on.
Q19: Breakthroughs in 2024 for Large Models: What might they be?
Answer: Multimodal approaches, especially during the pre-training phase.
Q20: Opinion on AMD's MI300 Series Graphics Card: How do they perform?
Answer: The new AMD MI300 card is believed to achieve 60-70% of the performance of an equivalent NVIDIA card. For large model developers, using NVIDIA's GPUs means they can run models directly, but using AMD's GPU might require months of adaptation.