[B! computerVision] manboubird's bookmarks

manboubird id:manboubird

manboubird's bookmarks about computerVision (454)

Data Labeling Best Practices - Video | MLOps Community
manboubird 2024/03/17
Data Labeling Best Practices

mlOps

annotation

dataLabeling

computerVision

llm

video

generativeAi

knowledgeGraph
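How GPT Had Quietly Killed Off Existing Business Card Management Apps - Qiita
"Killed off" may be an overstatement, but it seems we have reached an era in which a homemade app is enough for simple business card management. With this, we can surely say goodbye to paper business cards! I was asked to build a business card management app. This happened on 2/22. It all started with a phone call from another department. The new-business department deals with many business partners, but it has so many paper business cards that managing them has become a problem. I am a humble employee at a retail company, currently in a department responsible for e-commerce strategy planning. The person on the phone is someone I once worked with in the same department. They are now driving a new-business project and, facing the problem described above, considered adopting an existing business card management app, but nearly fainted at the steep price and reached out to me as a last resort. Previous apps used an AI specialized in business card recognition (). I asked for the details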
manboubird 2024/02/27
chatGpt

fastapi

computerVision

ocr
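Google Cloud Expands Generative AI Availability for Enterprises | Google Cloud Official Blog
Note: This post is an abridged translation of a post published on the Google Cloud blog on July 19, 2023 (US time). Generative AI development continues at a furious pace, and access to enterprise-ready capabilities that can harness this disruptive technology is more important than ever. Drawing on Google's decades of research, innovation, and investment in AI, Google Cloud always makes generative AI available with security, data governance, and overall scalability in place. To this end, last month Google announced the general availability of generative AI support on Vertex AI. This provides access to excellent foundation models from Google Research, along with tools for customizing and using those models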
manboubird 2023/07/30
google

vertexAi

embeddings

googleCloudPlatform

generativeAi

multimodal

computerVision
Toru Tamaki
manboubird 2023/07/29
resercher

computerVision
Image Segmentation
manboubird 2023/07/08
computerVision

semanticSegmentation

huggingface
https://kotarotanahashi-cvpr-app-1t7vny.streamlit.app/
manboubird 2023/06/15
search

paper

llm

cvpr

computerVision
Andrej Karpathy
manboubird 2023/05/13
tesla

openAi

artificialIntelligence

machineLearning

computerVision

researcher
Fundamentals and Latest Trends in NLP and Vision-and-Language (2) / DEIM Tutorial Part 2 Vision-and-Language
DEIM2023: Tutorial lecture materials from the 15th Forum on Data Engineering and Information Management, Part 2: Vision-and-Language
manboubird 2023/03/08
slide

deepLearning

nlp

computerVision
Fundamentals and Latest Trends in NLP and Vision-and-Language (1) / DEIM Tutorial Part 1: NLP
DEIM2023: Tutorial lecture materials from the 15th Forum on Data Engineering and Information Management, Part 1: NLP
manboubird 2023/03/08
nlp

computerVision

slide

deepLearning
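Natural Language Processing and Vision-and-Language / A Tutorial on NLP & Vision-and-Language
Tutorial lecture materials from the 2022 Annual Conference of the Japanese Society for Artificial Intelligence (36th)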
manboubird 2023/02/12
slide

ntt

vision

nlp

computerVision

deepLearning

clip
Google Research, 2022 & beyond: Language, vision and generative models
Language Models The progress on larger and more powerful language models has been one of the most exciting areas of machine learning (ML) research over the last decade. Important advances along the way have included new approaches like sequence-to-sequence learning and our development of the Transformer model, which underlies most of the advances in this space in the last few years. Although langu
manboubird 2023/01/24
nlp

computerVision

generativeAi

google

transformers
GitHub - Yutong-Zhou-cv/Awesome-Text-to-Image: (ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Text to Face👨🏻🧒👧🏼🧓🏽 (arXiv preprint 2024) [💬 3D] Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior, Yiqian Wu et al. [Paper] (CVPR 2024) CosmicMan: A Text-to-Image Foundation Model for Humans, Shikai Li et al. [Paper] [Project] (arXiv preprint 2024) Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping an
manboubird 2023/01/22
textToImage

links

generativeAi

nlp

computerVision

paper

stableDiffusion

transformers

textToSketch
https://dl.acm.org/doi/pdf/10.1145/3505244
manboubird 2023/01/22
Transformers in Vision: A Survey

paper

transformers

computerVision

survey

deepLearning
Learning Transferable Visual Models From Natural Language Supervision
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstr
manboubird 2023/01/22
openAi

clip

paper

gpt3

llm

nlp

computerVision
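[Paper Explained] Fusing Natural Language Processing and Image Processing - Understanding OpenAI's "CLIP"
In this post I will explain OpenAI's CLIP (Contrastive Language-Image Pre-training). CLIP is a model used for image classification; what sets it apart from earlier models is that it applies natural language processing techniques. In typical image classification, you prepare many images and label each one as dog, cat, apple, and so on, then train on them as supervised data. However, this approach has the following problems. Labeling is very costly. The set of labels is limited, so the model classifies the categories it was trained on well but has low accuracy on targets it sees for the first time (for example, training on dogs and cats and then classifying fruit). CLIP tackles these problems. Incidentally, CLIP is a pre-training method rather than a model architecture, so the model itself is ResNet or Visi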
manboubird 2023/01/21
clip

openAi

transformers

computerVision

nlp
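Mizuho Research & Technologies: An Overview of the Swin Transformer Method (1): Transformer and Vision Transformer
Swin Transformer builds on two methods: Transformer and Vision Transformer. Transformer was proposed in the field of natural language processing, and Vision Transformer applies it to image recognition. This article introduces these two methods. In 2010, before Transformer was proposed, Mikolov et al. [4] proposed the RNN (Recurrent Neural Network), a network architecture aimed at predicting time-series data. As attempts were made to apply RNNs to natural language processing by treating the sequence of words in a sentence as a time series, the following issues were pointed out. (1) Parallelization is difficult because processing of the next word cannot start until the current word has been processed. (2) As the sequence of words is processed sequentially,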
manboubird 2022/10/30
transformer

segmentation

objectDetection

computerVision

visionTransformer

swinTransformer
GitHub - mczhuge/Kaleido-BERT: 💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain. (CVPR2021)
manboubird 2022/10/30
kaleidoBert

bert

deepLearning

fashion

computerVision

nlp
2022.9.7 NAIST DSC Summer Seminar: "Latest Trends in Vision and Language Technology"
Presentation materials for the NAIST DSC Summer Seminar 2022. http://www-dsc.naist.jp/dsc_naist/naist-dsc-summer-seminar-2022/
manboubird 2022/09/09
computerVision

slide
Visual Relationship Detection with Language Priors
Visual relationships capture a wide variety of interactions between pairs of objects in images (e.g. "man riding bicycle" and "man pushing bicycle"). Consequently, the set of possible relationships is extremely large and it is difficult to obtain sufficient training examples for all possible relationships. Because of this limitation, previous work on visual relationship detection has concentrated
manboubird 2022/07/31
Visual Relationship Detection with Language Priors

sceneGraph

paper

knowledgeGraph

computerVision

stanford
https://datacentricai.org/neurips21/papers/118_CameraReady_DataCentricWorkshop2021_CR_Final.pdf
manboubird 2022/07/30
Fantastic Data and How to Query Them

paper

sceneGraph

knowledgeGraph

computerVision

sparql