experiments,

Building an app to observe the screen

Gerald Haslhofer Gerald Haslhofer Follow Dec 16, 2020 · 1 min read
Building an app to observe the screen
Share this

Deploy app: https://docs.microsoft.com/en-us/dotnet/core/deploying/

Create docker container with python: https://medium.com/@abdul.the.coder/how-to-deploy-a-containerised-python-web-application-with-docker-flask-and-uwsgi-8862a08bd5df

Demo with spacy: https://www.geeksforgeeks.org/python-named-entity-recognition-ner-using-spacy/

Classify categorieson Google Cloud

  • https://cloud.google.com/natural-language/docs/classify-text-tutorial
  • https://github.com/GoogleCloudPlatform/dotnet-docs-samples/tree/master/language/api/Analyze

Looking at layout LM

Installing detectron (from facebookresearch)

Installation:

  • https://detectron2.readthedocs.io/tutorials/install.html
  • python -m pip install ‘git+https://github.com/facebookresearch/detectron2.git’

Detectron FB collab https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=ZyAvNCJMmvFF

Install layourparser

https://github.com/Layout-Parser/layout-parser

Deep layoutparser example

https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb

Largely unsuccessful (though I did get it to run on Google colab)

Deploying sentence embedding to the cloud

https://docs.microsoft.com/en-us/azure/developer/python/tutorial-deploy-app-service-on-linux-01

Use visual fingerprinting to identify which regions are changing

https://realpython.com/fingerprinting-images-for-near-duplicate-detection/ https://github.com/JohannesBuchner/imagehash https://pypi.org/project/ImageHash/

(using motion estimation to find the difference between frames) There are two mainstream techniques of motion estimation: pel-recursive algorithm (PRA) and block-matching algorithm (BMA). PRAs are iterative refining of motion estimation for individual pels by gradient methods. BMAs assume that all the pels within a block has the same motion activity. BMAs estimate motion on the basis of rectangular blocks and produce one motion vector for each block. PRAs involve more computational complexity and less regularity, so they are difficult to realize in hardware. In general, BMAs are more suitable for a simple hardware realization because of their regularity and simplicity.

https://github.com/Breakthrough/DVR-Scan

imageassets

https://www.microsoft.com/en-us/research/blog/vinvl-advancing-the-state-of-the-art-for-vision-language-models/

transparent window

Create a new winforms app: https://stackoverflow.com/questions/56530534/how-can-i-create-a-windows-forms-project-using-net-core

Create winforms in dotnet core

https://executecommands.com/dotnet-core-winforms-application/

Creating tasks

https://docs.microsoft.com/en-us/graph/api/todotasklist-post-tasks?view=graph-rest-1.0&tabs=http

New way of streaming video screen?

https://flashphoner.com/screensharing-from-ffmpeg-to-webrtc/ https://github.com/unosquare/ffmediaelement https://trac.ffmpeg.org/wiki/Capture/Desktop

capture active screen

  • just use coordinate of active screen?

https://stackoverflow.com/questions/1163761/capture-screenshot-of-active-window https://code-examples.net/en/q/11c1f1

Use UWP APIs https://blogs.windows.com/windowsdeveloper/2019/09/16/new-ways-to-do-screen-capture/

Copy texture https://gamedev.stackexchange.com/questions/136016/sharpdx-texture-to-image-in-c

https://github.com/Microsoft/Windows.UI.Composition-Win32-Samples/tree/master/dotnet/WPF/ScreenCapture

Save Direct2d -> bmp https://stackoverflow.com/questions/21243270/saving-a-texture2d-using-sharpdx-when-targeting-win8-1-metro